Accelerating File Uploads at Halodoc: Performance, Reliability & Smart Optimization
Introduction
At Halodoc, seamless document uploads underpin patient care across our digital health platform. The system now handles 200,000 to well over 300,000 uploads per day, supporting physicians, hospital staff, and patients who depend on reliable uploads of medical images, prescriptions, and reports—often from regions with variable internet connectivity. As usage surged and file sizes grew, ensuring a fast, cost-efficient, and secure upload process became critical to both user experience and operational resilience.
Business Impact:
By upgrading our upload architecture, we delivered:
- 85% drop in p99 upload latency
- 40.7% improvement in average latency
- 55% throughput increase
- Scalable handling of peak traffic
- Enhanced patient safety by reducing upload errors and bottlenecks
Technical Challenges
Our system reliably streamed files to S3, but we identified several critical opportunities to improve performance and resilience:
1. WebP Image Conversion Bottlenecks
Medical images, prescriptions, and diagnostic scans dominate uploads, making compression efficiency key. Our initial WebP conversion, while effective, blocked other requests and taxed server CPUs, especially on large files. This led to p99 latencies of 1.28 seconds, with uploads sometimes exceeding 4 minutes.
2. Server Resource Utilization Under Load
Concurrent streaming, validation, and transformation of large documents strained memory and CPU during peak periods. WebP conversion alone could block threads for up to 23 seconds on a 13 MB file, occasionally degrading service for users uploading records at busy times.
3. Handling Large Files in S3 Efficiently
Legacy single-part S3 uploads were slow for files over 50 MB (15+ seconds for 50 MB, 22+ seconds for 100 MB), creating risks of timeouts and re-uploads. Backend-driven synchronous transfers also limited our ability to scale during high-volume periods.
Our Optimization Journey
Our optimization journey unfolded in three phases — from accelerating image conversion with WebP, to optimizing large file uploads using S3 multipart uploads, and finally scaling efficiently with presigned URL–based client uploads.
1. WebP Conversion — Smarter, Faster, Lighter
Among all the optimization initiatives, image conversion emerged as the most significant performance bottleneck. Since a majority of our uploads are images, efficient image handling was central to improving overall upload latency.
Why WebP?
WebP is a modern image format developed by Google that provides both lossy and lossless compression for images on the web. It was designed to achieve superior compression efficiency while maintaining visual fidelity comparable to JPEG and PNG.
How WebP Compression Works
WebP uses advanced compression techniques—including block prediction, transform coding, entropy coding, alpha channel compression, and color space optimization—to reduce file sizes significantly.
The result — files that are typically 25–35% smaller than JPEGs and up to 80% smaller than PNGs, without perceptible quality loss. Actual compression gains vary based on image content, but we consistently saw substantial size reductions.
Google’s Native CLI (cwebp)
Google’s cwebp CLI, written in C, is heavily optimized for performance and supports:
- Multi-threaded encoding (-mt)
- Configurable compression levels (-m 0–6)
- Quality tuning (-q)
- Lossless or lossy modes
Integration in Upload Flow
We integrated Google’s cwebp CLI in our file upload service. When a file is uploaded, the service:
- Invokes cwebp synchronously for image files using ProcessBuilder
- Waits for the process to complete
- Uploads the converted .webp file to S3

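To make this flow concrete, here is a minimal sketch of a synchronous cwebp invocation via ProcessBuilder. The binary path, quality setting, output naming, and timeout are illustrative assumptions, not our exact production code.

```java
import java.io.File;
import java.io.IOException;
import java.util.concurrent.TimeUnit;

public class WebpConverter {

    // Converts an uploaded image to WebP by invoking the cwebp CLI synchronously.
    public File convertToWebp(File input) throws IOException, InterruptedException {
        File output = new File(input.getParent(), input.getName() + ".webp");

        ProcessBuilder pb = new ProcessBuilder(
                "cwebp",                      // assumes cwebp is available on the PATH
                "-q", "80",                   // hypothetical quality setting
                input.getAbsolutePath(),
                "-o", output.getAbsolutePath());
        pb.redirectErrorStream(true);
        pb.redirectOutput(ProcessBuilder.Redirect.DISCARD);  // avoid blocking on unread process output

        Process process = pb.start();
        // Block until the conversion finishes: this is the synchronous step described above
        boolean finished = process.waitFor(2, TimeUnit.MINUTES);  // assumed upper bound
        if (!finished || process.exitValue() != 0) {
            throw new IOException("cwebp conversion failed for " + input.getName());
        }
        return output;  // the .webp file is then streamed to S3
    }
}
```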
WebP Conversion Optimization Discovery
While reviewing cwebp conversion, we found two parameters that can be fine-tuned for better performance:
| Flag | Description | Effect |
|---|---|---|
| -mt | Enables multithreading | Utilizes all CPU cores for block encoding |
| -m 0 | Sets compression method to 0 (fastest) | Prioritizes speed over minimal size gain |
How It Works
With -mt, cwebp splits the image into macroblocks and compresses them in parallel threads.
This allows cwebp to fully utilize all CPU cores instead of a single thread, drastically improving throughput.
The default mode (-m 4) represents a balanced trade-off between encoding speed and compression efficiency. For our documents where minor quality differences aren't visually significant, we didn't need this balance.
With (-m 0), we shift heavily toward speed: it uses simpler encoding decisions and skips advanced compression passes (like trellis quantization and enhanced filtering) that marginally improve file size but significantly increase encoding time. For document uploads where users are waiting, speed matters more than saving an extra 2-3% on file size.
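Applied to the ProcessBuilder sketch above, the optimization amounts to two extra arguments in the command; the quality value and paths remain illustrative assumptions.

```java
// Optimized invocation: -mt enables multi-threaded encoding, -m 0 selects the fastest method
ProcessBuilder pb = new ProcessBuilder(
        "cwebp",
        "-mt",                        // parallel macroblock encoding across CPU cores
        "-m", "0",                    // fastest compression method, minimal size penalty
        "-q", "80",                   // hypothetical quality setting
        input.getAbsolutePath(),
        "-o", output.getAbsolutePath());
```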
So we ran a few experiments using the same input set (1 MB – 100 MB PNGs) with and without these command changes, and the effect was massive.
WebP Conversion Performance: Default CLI vs Optimized CLI
| File Size | Default CLI | Optimized CLI (-mt -m 0) | Improvement |
|---|---|---|---|
| 1 MB | 133 ms | 39 ms | 70.7% faster |
| 10 MB | 542 ms | 223 ms | 58.9% faster |
| 50 MB | 2714 ms | 1045 ms | 61.5% faster |
| 100 MB | 13.5 s | 2.0 s | 84.9% faster |
Optimized CLI delivered roughly 59–85% faster conversion, with negligible changes in output size.
Note: Benchmarks were conducted on production-grade AWS EC2 instances with multi-core CPUs. The -mt flag’s effectiveness scales with available CPU cores, so actual improvements may vary depending on hardware.
2. S3 Multipart Upload — Handling Large Files Efficiently
After optimizing WebP conversion, our next focus was improving large-file uploads to S3. Uploaded synchronously, large files could cause latency spikes and block server threads.
How We Implemented Multipart Upload
AWS SDK’s TransferManager automatically handles multipart uploads for files above a configured threshold. It splits files into parts, uploads them concurrently using multiple threads, retries failed parts automatically, and combines all parts into a single S3 object. This improves throughput, reduces memory usage, and simplifies retry logic.
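A minimal sketch of this pattern with AWS SDK v1's TransferManager is shown below; the 20 MB threshold, default client configuration, and method shape are assumptions for illustration (the SDK v2 equivalent, S3TransferManager, is noted later).

```java
import java.io.File;

import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import com.amazonaws.services.s3.transfer.Upload;

public class S3MultipartUploader {

    // Assumed threshold: files above this size are uploaded in parts
    private static final long MULTIPART_THRESHOLD_BYTES = 20L * 1024 * 1024;

    private final TransferManager transferManager = TransferManagerBuilder.standard()
            .withS3Client(AmazonS3ClientBuilder.defaultClient())
            .withMultipartUploadThreshold(MULTIPART_THRESHOLD_BYTES)
            .build();

    public void upload(String bucket, String key, File file) throws InterruptedException {
        // TransferManager splits the file into parts, uploads them in parallel threads,
        // retries failed parts, and completes the multipart upload as a single S3 object.
        Upload upload = transferManager.upload(bucket, key, file);
        upload.waitForCompletion();
    }
}
```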

Why This Approach Improves Performance
- Parallelism: Multiple parts upload concurrently, increasing throughput.
- Memory Efficiency: Files are streamed in parts, so large files don’t consume excessive memory.
- Automatic Retry: Only failed parts are retried; successful parts are not re-uploaded.
- Simplicity: No need to manually manage file splits, threads, or retry logic — TransferManager handles it.
Performance Comparison: S3 Multipart vs Traditional Uploads
We benchmarked single-part and multipart S3 uploads across PDF files from 5 MB to 100 MB. The findings were consistent:
- Small files (5–20 MB): Multipart sometimes performed similarly or slightly slower due to coordination overhead — expected for smaller payloads.
- Large files (50–100 MB): Multipart uploads delivered stable gains, typically 20–58% faster, with added benefits like automatic retries, better resilience, and lower memory pressure.
- Why we use multipart above 20 MB: Even when raw speed improvements are modest, multipart prevents full-file retries on failures and ensures smoother, more reliable uploads under varying network conditions.
Note: We initially implemented this using AWS SDK v1's TransferManager and have since migrated to AWS SDK v2's S3TransferManager. Both versions provide similar multipart upload capabilities. The core concepts discussed here apply to both versions.
3. Presigned URL Upload — Asynchronous, Scalable File Uploads
To remove backend bottlenecks altogether, we introduced presigned URL uploads, allowing clients to upload directly to S3 while the backend only handles metadata.
How It Works

- Client Requests a Presigned URL: The backend generates a URL that allows the client to upload directly to S3.
- Client Uploads the File Directly to S3: Using the presigned URL, the client uploads the file. This step bypasses the backend, reducing memory and CPU usage.
- Backend Marks the Upload Complete: The client calls the backend to update the upload status; the backend verifies the file and marks the document as active.

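Below is a minimal sketch of the presigned URL generation step using AWS SDK v2's S3Presigner; the bucket name, key layout, and ten-minute expiry are illustrative assumptions rather than our production values.

```java
import java.time.Duration;

import software.amazon.awssdk.services.s3.model.PutObjectRequest;
import software.amazon.awssdk.services.s3.presigner.S3Presigner;
import software.amazon.awssdk.services.s3.presigner.model.PresignedPutObjectRequest;
import software.amazon.awssdk.services.s3.presigner.model.PutObjectPresignRequest;

public class PresignedUrlGenerator {

    // Generates a short-lived URL that lets the client PUT the file directly to S3.
    public String generateUploadUrl(String documentId) {
        try (S3Presigner presigner = S3Presigner.create()) {
            PutObjectRequest putRequest = PutObjectRequest.builder()
                    .bucket("halodoc-documents")          // hypothetical bucket
                    .key("uploads/" + documentId)         // hypothetical key layout
                    .build();

            PutObjectPresignRequest presignRequest = PutObjectPresignRequest.builder()
                    .signatureDuration(Duration.ofMinutes(10))  // short expiration window
                    .putObjectRequest(putRequest)
                    .build();

            PresignedPutObjectRequest presigned = presigner.presignPutObject(presignRequest);
            return presigned.url().toString();
        }
    }
}
```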
Security Best Practices for Presigned URLs
To ensure presigned URLs remain secure and tamper-proof, we implemented three safeguards:
- Short Expiration Windows: Presigned URLs expire within a configurable timeframe tailored to each use case, limiting the window for unauthorized use while ensuring legitimate uploads complete successfully.
- File Integrity Verification: A checksum is computed on the client side during presigned URL generation and verified by S3 during upload. This ensures the file isn’t corrupted or tampered with during transmission.
- Post-Upload Status Check: After the upload completes, the backend validates that the file exists in S3 and matches the expected metadata before marking the document as active. This prevents incomplete or failed uploads from being treated as successful and ensures a consistent system state.
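As one concrete illustration of the post-upload status check, the backend can issue a HeadObject call before activating the document. The size comparison below is an assumed validation rule, not an exact description of our checks.

```java
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.HeadObjectRequest;
import software.amazon.awssdk.services.s3.model.HeadObjectResponse;
import software.amazon.awssdk.services.s3.model.S3Exception;

public class UploadVerifier {

    private final S3Client s3 = S3Client.create();

    // Returns true only if the object exists in S3 and matches the size reported by the client.
    public boolean isUploadValid(String bucket, String key, long expectedSizeBytes) {
        try {
            HeadObjectResponse head = s3.headObject(
                    HeadObjectRequest.builder().bucket(bucket).key(key).build());
            return head.contentLength() == expectedSizeBytes;  // assumed validation rule
        } catch (S3Exception e) {
            if (e.statusCode() == 404) {
                // The client reported completion, but the object never landed in S3
                return false;
            }
            throw e;
        }
    }
}
```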
Benefits of Presigned URL Upload
- Async & Non-blocking: Backend does not stream the file, reducing memory and CPU load.
- Scalable: Clients handle the upload directly; backend only manages metadata.
- Resilient: Upload retries and network fluctuations are handled by the client-S3 interaction.
This approach complements synchronous multipart uploads: large files that require backend processing can still use multipart upload, while presigned URLs support fully decoupled, asynchronous workflows.
Summary — A Faster, Smarter Upload Experience
At Halodoc, our document upload system had to evolve to keep pace with growing usage and larger files. Here’s how we transformed it:
- Smarter Images: WebP conversion with multithreading made image uploads faster and lighter — smaller files without losing quality.
- Seamless Large File Handling: Multipart uploads let us stream big files efficiently, upload parts in parallel, and automatically retry failed chunks.
- Async, Scalable Uploads: Presigned URLs empower clients to upload directly to S3, reducing backend load while maintaining control and reliability.
Making a Difference for Healthcare
For Patients:
Faster uploads of medical reports mean quicker diagnoses, follow-ups, and care decisions.
For Doctors & Hospitals:
Reliable uploads reduce rework, prevent delays, and build trust in digital workflows.
Business Wins:
- Lower infrastructure costs through more efficient data transfer
- Fewer upload failures → higher user satisfaction and fewer support tickets
Join Us
Scalability, reliability and maintainability are the three pillars that govern what we build at Halodoc Tech. We are actively looking for engineers at all levels, and if solving hard problems with challenging requirements is your forte, please reach out to us with your resumé at careers.india@halodoc.com.
About Halodoc
Halodoc is the number one all-around healthcare application in Indonesia. Our mission is to simplify and deliver quality healthcare across Indonesia, from Sabang to Merauke.
Since 2016, Halodoc has been improving health literacy in Indonesia by providing user-friendly healthcare communication, education, and information (KIE). In parallel, our ecosystem has expanded to offer a range of services that facilitate convenient access to healthcare, starting with Homecare by Halodoc as a preventive care feature that allows users to conduct health tests privately and securely from the comfort of their homes; My Insurance, which allows users to access the benefits of cashless outpatient services more seamlessly; Chat with Doctor, which allows users to consult with over 20,000 licensed physicians via chat, video or voice call; and Health Store features that allow users to purchase medicines, supplements and various health products from our network of over 4,900 trusted partner pharmacies. To deliver holistic health solutions in a fully digital way, Halodoc offers Digital Clinic services, including Haloskin, a trusted dermatology care platform guided by experienced dermatologists.
We are proud to be trusted by global and regional investors, including the Bill & Melinda Gates Foundation, Singtel, UOB Ventures, Allianz, GoJek, Astra, Temasek, and many more. With over USD 100 million raised to date, including our recent Series D, our team is committed to building the best personalised healthcare solutions, and we remain steadfast in our journey to simplify healthcare for all Indonesians.