Automating Dynamic Prerendering at Halodoc: A Data-Driven Approach to SEO and Faster Page Discovery
At Halodoc, helping users discover reliable healthcare information quickly is core to our mission. For our public-facing web platform, SEO is not only a growth lever but also an engineering responsibility that directly impacts accessibility, performance, and content discovery. When users search for healthcare information, faster page discovery and rendering can meaningfully improve their experience.
Historically, we used prerendering to make key public pages faster to load and easier for search engines to index. We’ve previously discussed the fundamentals of Prerendering in Angular and why it’s a game-changer for performance. While modern Angular supports Server-Side Rendering (SSR), a purely client-rendered SPA still ships a minimal HTML shell and relies heavily on browser-side JavaScript to display content. Prerendering avoids this by generating the complete HTML at build time, so crawlers and users receive indexable content immediately instead of waiting for client-side JavaScript to render it. But deciding which pages to prerender is where the real engineering challenge begins.
The Problem: Hardcoded Routes and the "Static" Trap
Initially, our approach to prerendering dynamic routes (like specific articles or medication pages) was entirely manual. We had a static list of URLs hardcoded directly into our server routing logic. This created several pain points:
- Stale Data: Health trends shift rapidly from week to week. A sudden spike in local dengue cases can completely alter user search intent in a matter of days. Manual updates simply couldn't keep up with this high-velocity traffic data.
- Code Bloat: Our server route files were becoming cluttered with long lists of static IDs.
- High Risk: Every time we wanted to update the prerender list, an engineer had to manually modify core routing files, increasing the risk of syntax errors.
We needed a system that could adapt to search trends autonomously.
The Solution: A Data-Driven Pipeline
We realized that the most accurate source of "what matters right now" is the Google Search Console (GSC) API, our most direct view of recent search traffic. We built an automation script that fetches search performance data and refreshes our application's prerender configuration automatically.
How the Script Works
The script acts as a bridge between Google’s search analytics and our codebase. To break it down, here is exactly what the code does:
- Logs In: It uses a secure key to authenticate with Google Cloud.
- Asks a Question: It queries Google, "Which specific articles and medicine pages received the most clicks over the last 7 days?"
- Filters the Winners: It drops the low-traffic pages and keeps only the absolute top performers (our "Smart Threshold").
- Saves the List: It takes those winning pages and saves them into an updated, machine-readable list in our codebase.
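Steps two and three depend on two small pieces that the article does not show: the date window and the mapping from a page URL to a route parameter. The sketch below is one way they might look. The helper bodies, the `/artikel/` prefix, and the regex are assumptions for illustration; only the field names (`name`, `paramKey`, `prefix`, `regex`, `limit`) come from the script itself.

```javascript
// Hypothetical helpers; bodies and example values are assumptions, not Halodoc's code.

// Returns an inclusive window of `days` days ending at `now`, as YYYY-MM-DD strings.
function getDateRange(now = new Date(), days = 7) {
  const toIsoDate = (d) => d.toISOString().slice(0, 10);
  const start = new Date(now);
  start.setUTCDate(start.getUTCDate() - (days - 1));
  return { start: toIsoDate(start), end: toIsoDate(new Date(now)) };
}

// One entry per dynamic route; the prefix and regex here are illustrative only.
const PATH_QUERIES = [
  {
    name: 'artikel',
    paramKey: 'slug',
    prefix: '/artikel/',
    regex: '^https://www\\.halodoc\\.com/artikel/[^/]+$',
    limit: 10,
  },
];

// Turns a page URL into the route parameter value, or null if it does not fit.
function extractParam(pageUrl, prefix) {
  if (!pageUrl) return null;
  let pathname;
  try {
    pathname = new URL(pageUrl).pathname;
  } catch {
    return null; // not an absolute URL
  }
  if (!pathname.startsWith(prefix)) return null;
  const rest = pathname.slice(prefix.length).replace(/\/+$/, '');
  // Reject empty values and nested paths such as "a/b"
  return rest && !rest.includes('/') ? rest : null;
}
```

Keeping `extractParam` strict (one path segment, nothing nested) means an unexpected URL shape is skipped rather than written into the prerender list.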
Here is an excerpt of our core logic:
// scripts/run-gsc-prerender.mjs
import { google } from 'googleapis';
import { writeFileSync } from 'node:fs';

// Excerpt: getDateRange, PATH_QUERIES, extractParam and writeGeneratedFile
// are not shown here.
async function generatePrerenderConfig() {
  // 1. Authenticate using Service Account key
  const auth = new google.auth.GoogleAuth({
    keyFile: process.env.GSC_KEY_FILE,
    scopes: ['https://www.googleapis.com/auth/webmasters.readonly'],
  });
  const searchconsole = google.searchconsole({ version: 'v1', auth: await auth.getClient() });
  const { start, end } = getDateRange(); // Evaluates the recent traffic window
  const params = { kesehatan: [], obatSlug: [], obatCategory: [], artikel: [] };

  // 2. Query GSC for each dynamic route path using Regex
  for (const query of PATH_QUERIES) {
    const res = await searchconsole.searchanalytics.query({
      siteUrl: 'https://www.halodoc.com',
      requestBody: {
        startDate: start,
        endDate: end,
        dimensions: ['page'],
        dimensionFilterGroups: [{
          filters: [{ dimension: 'page', operator: 'includingRegex', expression: query.regex }]
        }],
        rowLimit: 10000,
      },
    });

    // 3. Process routes meeting the "Smart Threshold" (e.g., minimum 10 clicks)
    const rows = res.data?.rows ?? [];
    const sorted = [...rows].sort((a, b) => (b.clicks ?? 0) - (a.clicks ?? 0));
    for (const row of sorted) {
      // Treat a missing click count as zero, matching the sort above
      if ((row.clicks ?? 0) >= query.limit) {
        const value = extractParam(row.keys?.[0], query.prefix);
        if (value) {
          params[query.name].push({ [query.paramKey]: value });
        }
      }
    }
  }

  // 4. Write output to configuration layer
  writeGeneratedFile(params);
}

Decoupling with a Configuration Layer (The Constant File)
Once the script finds the top pages, it writes them into a constant file.
By doing this, we separate the data from the engine. Instead of burying our top SEO links deep inside complex server routing code, we save them into a simple, standalone list. This means our automated script can safely overwrite this file every week without any risk of breaking the core application. The server simply reads whatever is on the list today.
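As a rough illustration of that overwrite step, the sketch below renders the collected parameters into the text of a constants file. The function name `renderConstantFile` is hypothetical (the script's real `writeGeneratedFile` is not shown), and the output uses JSON-style quoted keys, which are valid TypeScript but may differ cosmetically from the real generated file.

```javascript
// Hypothetical renderer; the article's writeGeneratedFile is not shown.
// Returns the file contents as a string so it can be inspected before writing.
function renderConstantFile(params) {
  const header = '// Generated by scripts/run-gsc-prerender.mjs - do not edit by hand.';
  // JSON.stringify keeps the output deterministic for identical input,
  // so an unchanged week produces an identical file and no Git diff.
  const body = `export const gscPrerenderParamsData = ${JSON.stringify(params, null, 2)};`;
  return [header, body, ''].join('\n');
}
```

The returned string could then be passed to `writeFileSync` along with the target path. Because the renderer is a pure function, the same input always yields the same file, which makes the weekly diff easy to review.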
// src/app/prerender-routes/gsc-prerender-params.constant.ts
// Generated by scripts/run-gsc-prerender.mjs - do not edit by hand.
export const gscPrerenderParamsData = {
  kesehatan: [
    { nama: "dengue-fever-symptoms" },
    { nama: "flu-care-guide" }
  ],
  obatSlug: [
    { slug: "paracetamol-500mg" }
  ],
  obatCategory: [
    { category: "vitamin-c" }
  ],
  artikel: [
    { slug: "manfaat-vitamin-c" }
  ],
};

Under the Hood: Integrating with GCP
To make this secure and automated, we configured a strict gateway in the Google Cloud Platform (GCP):
- Service Account Creation: We created a dedicated Service Account in the GCP Console. Unlike a personal user account, a Service Account is designed for "machine-to-machine" communication, making it perfect for our weekly Jenkins automation.
- Key Generation: We generated a JSON Key for this service account. Crucially, this key is securely injected via the Jenkins credentials store and is never committed to our repository.
- Permissions (The "GSC Handshake"): Simply having a GCP account isn't enough. We went into the Google Search Console settings for the Halodoc domain and added the Service Account's email address as a "User" with Restricted (read-only) permissions. Since the script only needs to execute searchanalytics.query, read-only access follows the principle of least privilege.
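A small preflight check can make a misconfigured credential fail fast in CI, before any API call is attempted. The sketch below is an assumption about how such a check might look and is not part of the script shown earlier. It only inspects fields that appear in a standard service account key file (`type`, `client_email`, `private_key`), and the function name is hypothetical.

```javascript
// Hypothetical preflight check for the injected service account key.
// Takes the key file's text and reports whether it looks usable.
function checkServiceAccountKey(jsonText) {
  let key;
  try {
    key = JSON.parse(jsonText);
  } catch {
    return { ok: false, reason: 'key file is not valid JSON' };
  }
  if (!key || typeof key !== 'object') {
    return { ok: false, reason: 'key file is not a JSON object' };
  }
  if (key.type !== 'service_account') {
    return { ok: false, reason: 'not a service account key' };
  }
  for (const field of ['client_email', 'private_key']) {
    if (typeof key[field] !== 'string' || key[field] === '') {
      return { ok: false, reason: `missing ${field}` };
    }
  }
  // The email is the identity that must be added as a user in Search Console
  return { ok: true, clientEmail: key.client_email };
}
```

Returning `clientEmail` on success gives the job something concrete to log when a permissions error appears: it is the address that has to be present in the Search Console user list.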
The Automation Loop (Jenkins + Git)
To ensure our prerendered pages benefit users at the exact moment of intent, we use a weekly Jenkins cron job restricted to a rolling 7-day traffic window. By evaluating only the last week of data, the system remains hyper-sensitive to immediate breakouts.
The lifecycle of this weekly update follows a strict logic:
- Trigger: At the start of every week, a Jenkins cron job executes the automation script.
- The Decision Point: The script compares the past 7 days of data against the existing configuration file. If the traffic patterns have not shifted significantly, the job ends quietly.
- Auto-Proposal: If new trending routes are identified, the script automatically creates a Merge Request (MR).
- Instant Notification: To maintain momentum, the job posts to our Web team's space and tags the on-call engineers.
- First Responder Model: The first available engineer acknowledges the MR, performs a quick sanity check, and deploys the update.
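The decision point in this loop can be sketched as a set comparison between the existing configuration and the freshly fetched one. The article does not describe how "shifted significantly" is measured, so this example makes a simplifying assumption: any route entering or leaving a category counts as a change, while a reordering alone does not. The function name is hypothetical.

```javascript
// Hypothetical diff between the committed params and the newly fetched ones.
// Assumption: any membership change is significant; order changes are ignored.
function diffPrerenderParams(current, next) {
  const keyOf = (entry) => JSON.stringify(entry);
  const added = {};
  const removed = {};
  let changed = false;
  const names = new Set([...Object.keys(current), ...Object.keys(next)]);
  for (const name of names) {
    const before = new Set((current[name] ?? []).map(keyOf));
    const after = new Set((next[name] ?? []).map(keyOf));
    added[name] = [...after].filter((k) => !before.has(k)).map((k) => JSON.parse(k));
    removed[name] = [...before].filter((k) => !after.has(k)).map((k) => JSON.parse(k));
    if (added[name].length > 0 || removed[name].length > 0) changed = true;
  }
  return { changed, added, removed };
}
```

When `changed` is false the job can exit without opening a Merge Request. Otherwise the `added` and `removed` lists give the reviewer a readable summary.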
The Challenge of Scale
The "Too Many Pages" Problem
When stakeholders learn about the benefits of prerendering (instant load times, perfect SEO), the immediate question is usually: "Why don't we just prerender every single page on the website?"
For a small static site, that is exactly what you do. But a healthcare platform the size of Halodoc has thousands of dynamic routes, from individual medication variants to a vast library of health articles. Generating a static HTML file for every possible route every time we deploy a code change is simply not feasible at our scale.
The Trade-off: Managing the Build Time Tax
While prerendering is fantastic for user experience and SEO, it comes with a strict engineering cost: build time. Baking thousands of pages in advance would bloat our CI/CD pipeline, turning a rapid 90-second deployment process into a dangerously slow crawl. Every additional page added to the prerender array is a literal "tax" on our release velocity. We needed a way to get the maximum SEO benefit with the minimum build-time tax.
Our Strategy: The "Smart Threshold"
To mitigate the risks of bloated build times, we did not just automate the process; we also automated the prioritization. The automation script ranks every page in each dynamic route category by clicks and keeps only the top performers.
Currently, we apply a static baseline cap of the top 10 pages per category. Anything outside this cut falls back to standard Server-Side Rendering (SSR). These pages remain fully indexable by crawlers and are still rendered on demand; they simply are not pre-baked into the build output. We reserve the "Express Lane" for the specific pages that actually move the bulk of our organic traffic.
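The cap itself is a small operation: rank by clicks, then keep the first N. The sketch below shows that step in isolation, under the assumption that each row carries a `clicks` count as Search Console rows do. The function name `selectTopPages` is hypothetical. Note that this is a cap on the number of pages, which is distinct from filtering by a minimum click count.

```javascript
// Hypothetical helper: keep the `cap` most-clicked rows for one route category.
function selectTopPages(rows, cap = 10) {
  return [...rows] // copy so the caller's array is not reordered
    .sort((a, b) => (b.clicks ?? 0) - (a.clicks ?? 0)) // missing clicks count as 0
    .slice(0, cap);
}
```

Anything beyond the first `cap` rows is simply left out of the generated list and continues to be rendered on demand.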
The Future: Traffic-Weighted Calculations
While a static cap of 10 pages keeps our current CI/CD pipeline agile (finishing in roughly 70 to 90 seconds), it is only our starting point. Our next engineering milestone is to replace this static cap with a dynamically calculated threshold. In the future, the system will autonomously decide how many pages to prerender by weighing a page's click velocity against its projected build-time cost, scaling the output up or down based on actual demand rather than a hardcoded limit.
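One possible shape for such a calculation is a greedy selection: rank pages by clicks per second of build cost and add them until a build-time budget is used up. This is purely a hypothetical sketch of the idea, not a design we have committed to. The `buildSeconds` field, the budget, and the function name are all assumptions, and a greedy pass is a heuristic rather than an optimal solution.

```javascript
// Hypothetical traffic-weighted selection under a build-time budget.
// Each page needs `clicks` and a positive `buildSeconds` estimate.
function selectWithinBudget(pages, budgetSeconds) {
  const ranked = [...pages].sort(
    (a, b) => b.clicks / b.buildSeconds - a.clicks / a.buildSeconds
  );
  const chosen = [];
  let spent = 0;
  for (const page of ranked) {
    // Skip pages that would overshoot; cheaper ones later may still fit
    if (spent + page.buildSeconds <= budgetSeconds) {
      chosen.push(page);
      spent += page.buildSeconds;
    }
  }
  return { chosen, spent };
}
```

With a budget instead of a fixed count, the number of prerendered pages would grow when pages are cheap to build and shrink when they are expensive.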
The Workflow Architecture
To bridge the gap between search data and production performance, we implemented an automated pipeline that connects Google’s analytics directly to our deployment workflow. This loop ensures our site remains optimized without manual intervention.
The pipeline follows a clear, linear progression to ensure data integrity and build stability:
- Trigger: A weekly Jenkins cron job initiates the process every Monday.
- Data Extraction: The script queries the Google Search Console API for the top-performing pages from the last 7 days.
- Prioritization: Pages are ranked by clicks per category. We then apply our Smart Threshold to keep only the top 10 performers per dynamic route.
- File Generation: The script regenerates the gsc-prerender-params.constant.ts file with the fresh data.
- Review: Jenkins opens an automated Merge Request and notifies the team via Google Chat for a quick sanity check.
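For the review step, the notification can be as simple as a text message built from the diff summary. The sketch below assumes the job posts through a Google Chat incoming webhook, which accepts a JSON body with a `text` field. The article does not say which mechanism is actually used, and `mrUrl` plus the summary shape are hypothetical.

```javascript
// Hypothetical message builder; `mrUrl` and the summary shape are assumptions.
function buildChatMessage({ mrUrl, added, removed }) {
  const count = (byCategory) =>
    Object.values(byCategory).reduce((total, list) => total + list.length, 0);
  const text = [
    'GSC prerender refresh: a Merge Request is ready for review',
    `Routes added: ${count(added)}, removed: ${count(removed)}`,
    mrUrl,
  ].join('\n');
  // Assumed target: a Google Chat incoming webhook, which takes { text }
  return { text };
}
```

The returned object would be serialized with `JSON.stringify` and sent as the body of an HTTP POST to the webhook URL, which, like the service account key, belongs in the CI credentials store rather than in the repository.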
The Impact in Numbers
| Feature | Old Way (Hardcoded) | New Way (Automated) |
| --- | --- | --- |
| Logic Location | Mixed with Server Routes | Decoupled in .constant.ts |
| Data Accuracy | Manual / Outdated | Refreshed weekly (last 7 days) |
| Engineering Effort | Hours of manual work per cycle | < 5 minutes (Review only) |
| Human Error | High (Manual typing) | Near-zero (Eliminated manual typing) |
The results highlight a significant improvement in both operational efficiency and system performance:
- Engineering Velocity: What was once a tedious manual audit is now a seamless automated Merge Request. We have effectively eliminated manual engineering effort on route-list maintenance and shifted the human role solely to a brief final approval.
- Build Stability: Capping the list at the top 10 pages per dynamic route category keeps the build within a predictable envelope: a full production build completes in 70 to 90 seconds. In our performance testing on identical hardware with a cold Angular cache, the master branch built in 72s and the gsc-refresh branch in 91s, a delta of 19 seconds (about 26%).
- Data Freshness: Our prerendered content is now synchronized with current user behavior. The lag between emerging traffic trends and our site configuration has been reduced from a 3 to 6 month window to a maximum of 7 days.
Building Your Own: The Halodoc AI Skill
Since keeping prerender targets updated is a common challenge for many large-scale web applications, we didn't want to just solve it for ourselves. We've packaged the operational logic for this entire data pipeline into a custom, shareable AI skill. You can find the full blueprint in our Halodoc AI Skills Repository.
While the core prompts and architectural blueprints can be adapted for practically any AI platform or coding assistant, the step-by-step example below demonstrates how to install and run it natively using Claude Code.
Step-by-Step Installation (Claude Code Example)
If you use Claude Code as your terminal assistant, you can pull this skill directly into your local CLI workspace. We handle this via an ultra-fast, resource-friendly Git sparse checkout.
Open your terminal and run the following commands to clone the target directory and import the skill:
# Clone only the specific skill folder (sparse checkout, so the rest of the repo is not downloaded)
git clone --filter=blob:none --sparse https://github.com/halodoc-tech/halodoc-ai.git /tmp/halodoc-ai
cd /tmp/halodoc-ai
git sparse-checkout set Skills/web/gsc-driven-prerendering
# Copy it into the local skills directory (created first if it does not exist)
mkdir -p ~/.claude/skills
cp -r Skills/web/gsc-driven-prerendering ~/.claude/skills/gsc-driven-prerendering

Once the files are copied, restart Claude Code. The newly registered command will be active and ready to handle tasks:
/gsc-driven-prerendering

Step 1: Project Detection (The Safety Check)
When we ran the skill on our own codebase, it behaved like a senior engineer checking the environment before writing a single file. It scanned the codebase to confirm two prerequisites: that Angular’s Server-Side Rendering was active, and that our server routes were already configured for prerendering. It also checked whether an older script already existed and, finding none, determined that it needed to build the system from scratch.
Step 2: Generating the Runner Script
The skill generated a custom scripts/run-gsc-prerender.mjs script to securely talk to Google. It was smart enough to infer our production URL (https://www.halodoc.com) from existing files. It then set up a strict budget: limiting the prerender to the top 10 pages per route. Since every extra page adds time to the build, budgeting conservatively is crucial. Finally, it wrote precise URL filtering rules to ensure article traffic data wouldn't get mixed up with medication data.
Step 3: Wiring the Server Routes
Next, the AI opened our core routing file (app.routes.server.ts). It ripped out the old, manually typed list of URLs and replaced it with a dynamic reference pointing directly to the fresh data generated by the Google script.
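To show the shape of that dynamic reference, here is a minimal sketch of server routes built from the generated data. Angular's server routing lets a route declare `RenderMode.Prerender` together with a `getPrerenderParams` function that returns the parameter objects to prerender. In this sketch `RenderMode` is passed in as an argument so the example runs standalone (in an Angular app it would be imported from `@angular/ssr`), and the route paths and function name are hypothetical rather than our real ones.

```javascript
// Hypothetical wiring; route paths are illustrative, not Halodoc's real routes.
// `RenderMode` is injected so the sketch runs outside an Angular project.
function buildServerRoutes(RenderMode, paramsData) {
  const dynamicRoutes = [
    { path: 'artikel/:slug', name: 'artikel' },
    { path: 'kesehatan/:nama', name: 'kesehatan' },
  ];
  const prerendered = dynamicRoutes.map(({ path, name }) => ({
    path,
    renderMode: RenderMode.Prerender,
    // Called at build time to learn which parameter sets to prerender
    getPrerenderParams: async () => paramsData[name] ?? [],
  }));
  // Every other route is rendered on the server on demand
  return [...prerendered, { path: '**', renderMode: RenderMode.Server }];
}
```

The routing file then contains no page identifiers at all. Changing which pages are prerendered only touches the generated constants file.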
Step 4: Seeding the Constants File (Protecting Local Builds)
Running the live Google script requires a highly secure API key, which isn't available on local developer laptops. To ensure the codebase wouldn't break for engineers running the app locally, the skill automatically generated an initial "seed" file, pre-populated with our legacy routes. Bonus: While creating this seed file, the AI silently noticed and removed a duplicate article (jangan-panik-ini-cara-mengatasi-kucing-yang-sering-muntah) that a human had accidentally pasted twice in the old manual list!
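The duplicate removal is easy to reproduce when seeding such a file by hand. A minimal sketch, assuming each entry is a small plain object like `{ slug: "..." }` (the function name is hypothetical):

```javascript
// Hypothetical helper: drop repeated entries while keeping first-seen order.
function dedupeEntries(entries) {
  const seen = new Set();
  return entries.filter((entry) => {
    const key = JSON.stringify(entry); // entries are small single-key objects
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```

Running the legacy list through a helper like this before writing the seed file means a pasted duplicate never reaches the build.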
Step 5: Alignment Verification
A silent mismatch between an Angular route name and the script's tracking keys wouldn't cause a noticeable crash—the system would just skip prerendering that page entirely. To prevent this, the skill ran targeted alignment checks, cross-referencing the parameter names across all files to confirm they matched perfectly before completing its session.
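A check of this kind can also run in CI. The sketch below compares the `:param` names in each route path with the keys of the generated entries and reports every mismatch. The route list shape (`path`, `name`) and the function name are assumptions made for illustration.

```javascript
// Hypothetical alignment check between route params and generated entries.
function findMisalignedRoutes(routes, paramsData) {
  const problems = [];
  for (const { path, name } of routes) {
    // Collect ":param" names from the path, e.g. "artikel/:slug" -> ["slug"]
    const expected = (path.match(/:([A-Za-z_]\w*)/g) ?? []).map((s) => s.slice(1)).sort();
    if (!(name in paramsData)) {
      problems.push({ path, name, reason: 'no generated entries for this route' });
      continue;
    }
    paramsData[name].forEach((entry, index) => {
      const actual = Object.keys(entry).sort();
      if (actual.join(',') !== expected.join(',')) {
        problems.push({
          path, name, index,
          reason: `expected keys [${expected}] but found [${actual}]`,
        });
      }
    });
  }
  return problems;
}
```

Failing the job when the returned list is non-empty turns a silent skip into a visible error.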
By letting an LLM handle the boilerplate Google API authentication and regex filtering, your team can focus entirely on wiring up the CI/CD pipeline and adjusting the "Smart Threshold" to fit your specific build-time constraints. Check out the repository for the complete template and start automating your own SEO pipeline today!
Conclusion
This project was about more than just speed; it was about engineering maturity. By decoupling our data and automating our insights, we've reclaimed engineering time and ensured that Halodoc remains the fastest way for users to find the healthcare information they need. Looking ahead, we plan to experiment with more granular traffic segments and explore real-time triggers to capture emerging search trends, ensuring Halodoc’s performance is always synchronized with the current needs of our users.
Join us
Scalability, reliability and maintainability are the three pillars that govern what we build at Halodoc Tech. We are actively looking for engineers at all levels — if solving hard problems with challenging requirements is your forte, please reach out with your résumé at careers.india@halodoc.com.
About Halodoc
Halodoc is the number one all-around healthcare application in Indonesia. Our mission is to simplify and deliver quality healthcare across Indonesia, from Sabang to Merauke.
Since 2016, Halodoc has been improving health literacy in Indonesia by providing user-friendly healthcare communication, education, and information (KIE). In parallel, our ecosystem has expanded to offer a range of services that facilitate convenient access to healthcare, starting with Homecare by Halodoc as a preventive care feature that allows users to conduct health tests privately and securely from the comfort of their homes; My Insurance, which allows users to access the benefits of cashless outpatient services in a more seamless way; Chat with Doctor, which allows users to consult with over 20,000 licensed physicians via chat, video or voice call; and Health Store features that allow users to purchase medicines, supplements and various health products from our network of over 4,900 trusted partner pharmacies. To deliver holistic health solutions in a fully digital way, Halodoc offers Digital Clinic services including Haloskin, a trusted dermatology care platform guided by experienced dermatologists.
We are proud to be trusted by global and regional investors, including the Bill & Melinda Gates Foundation, Singtel, UOB Ventures, Allianz, GoJek, Astra, Temasek, and many more. With over USD 100 million raised to date, including our recent Series D, our team is committed to building the best personalized healthcare solutions — and we remain steadfast in our journey to simplify healthcare for all Indonesians.