Why the order matters
Each step is a little more constrained and more expensive than the one before it. Step 1 (mobile API) gives you JSON over a permissive HTTP endpoint at the cost of one afternoon learning HTTPToolkit. Step 2 (XHR) gives you JSON over a possibly-protected HTTP endpoint. Step 3 (JSON-in-HTML) gives you the same data as a string parse with no browser. Steps 4–6 cost progressively more infrastructure and budget.
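Walking the flow in order can be sketched as a loop that tries the cheapest step first and escalates only on failure. This is a minimal illustration, not a real scraper: the `walk_ladder` helper and the three stub attempts are hypothetical names, and each stub stands in for whatever fetcher you would write for that step.

```python
from typing import Callable, Optional

# An attempt returns parsed data on success, or None when the step fails.
Attempt = Callable[[str], Optional[dict]]


def walk_ladder(url: str, steps: list[tuple[str, Attempt]]) -> tuple[str, dict]:
    """Try each step in order and return the first one that yields data."""
    for name, attempt in steps:
        data = attempt(url)
        if data is not None:
            return name, data
    raise RuntimeError(f"no step produced data for {url}")


# Hypothetical stubs: here steps 1 and 2 fail and step 3 succeeds.
def mobile_api(url: str) -> Optional[dict]:
    return None


def xhr(url: str) -> Optional[dict]:
    return None


def json_in_html(url: str) -> Optional[dict]:
    return {"source": "__NEXT_DATA__"}


steps = [
    ("1 mobile API", mobile_api),
    ("2 XHR", xhr),
    ("3 JSON-in-HTML", json_in_html),
]
winner, payload = walk_ladder("https://target.com", steps)
print(winner, payload)
```

Because the list is ordered by cost, the loop never reaches a patched browser or a managed API unless every cheaper step has already returned nothing.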
The cost ladder is real. Step 4 needs residential proxies (~$3–10/GB). Step 5 needs a patched-browser binary, about 200MB of RAM per instance, and proxies. Step 6 is billed per request on managed APIs ($0.20–$3 per 1,000 requests). Starting at step 5 when step 1 would have worked wastes engineering time, yet it is what scrapers routinely do when they don't consciously walk the flow.
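To put rough numbers on the ladder, the arithmetic below compares proxy bandwidth cost with managed-API pricing for the same request volume. The price ranges are the ones quoted above; the volume of 1,000,000 requests and the 200 KB average response size are assumptions chosen only for illustration.

```python
def proxy_cost(requests: int, kb_per_request: float, usd_per_gb: float) -> float:
    """Bandwidth cost: total GB transferred times the per-GB proxy rate."""
    gb = requests * kb_per_request / 1_000_000  # KB -> GB, decimal units
    return gb * usd_per_gb


def managed_cost(requests: int, usd_per_1000: float) -> float:
    """Managed-API cost: billed per 1,000 requests."""
    return requests / 1000 * usd_per_1000


n = 1_000_000          # assumed monthly request volume
avg_kb = 200           # assumed average response size in KB

proxy_low = proxy_cost(n, avg_kb, 3.0)     # 200 GB at $3/GB
proxy_high = proxy_cost(n, avg_kb, 10.0)   # 200 GB at $10/GB
managed_low = managed_cost(n, 0.20)
managed_high = managed_cost(n, 3.0)

print(f"proxy:   ${proxy_low:,.0f} to ${proxy_high:,.0f}")
print(f"managed: ${managed_low:,.0f} to ${managed_high:,.0f}")
```

Under these assumptions the two ranges overlap, so response size and the specific vendor price decide which is cheaper. Steps 1–3 avoid both line items entirely.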
Step-by-step with confirmed bypasses
Step 0 — Recon. Install Wappalyzer (Chrome extension) and visit the target. It identifies the anti-bot vendor in one click. Alternatively, run wafw00f https://target.com from the CLI. With Burp Suite MCP attached to Claude Code, one prompt traces the cookie lifecycle and recommends the bypass step.
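The same kind of recon can be approximated by hand from a response's headers and cookie names. The sketch below is an assumption-laden heuristic, not how Wappalyzer or wafw00f work internally: the marker lists are commonly reported cookie and header names, they are not exhaustive, and the `guess_vendors` function is a hypothetical helper.

```python
# Commonly reported markers per vendor (illustrative, non-exhaustive assumption).
COOKIE_MARKERS = {
    "Akamai Bot Manager": ("_abck", "bm_sz", "ak_bmsc"),
    "Cloudflare": ("__cf_bm", "cf_clearance"),
    "DataDome": ("datadome",),
}
HEADER_MARKERS = {
    "Cloudflare": ("cf-ray",),
}


def guess_vendors(headers: dict[str, str], cookie_names: list[str]) -> list[str]:
    """Return a sorted list of vendors whose markers appear in the response."""
    header_names = {name.lower() for name in headers}  # headers are case-insensitive
    cookies = set(cookie_names)
    found = set()
    for vendor, names in COOKIE_MARKERS.items():
        if cookies & set(names):
            found.add(vendor)
    for vendor, names in HEADER_MARKERS.items():
        if header_names & set(names):
            found.add(vendor)
    return sorted(found)


vendors = guess_vendors({"CF-RAY": "abc123"}, ["_abck", "datadome"])
print(vendors)
```

A match only says a vendor's cookie or header is present, not how aggressively it scores traffic, so treat the result as a hint for which step to try first.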
Step 1 — Mobile API. Rooted Android Studio AVD + HTTPToolkit. The mobile app often hits a separate backend with weaker bot protection. Confirmed in production: a direct GraphQL endpoint from a major retailer's mobile app bypassed the entire web-side Akamai + DataDome stack.
Step 2 — XHR. Chrome DevTools → Network → Fetch/XHR. Many SPAs load all data from one undocumented JSON endpoint. Confirmed: a single GraphQL endpoint bypassed all of one retailer's HTML anti-bot.
Step 3 — JSON in HTML. Next.js sites embed full state in __NEXT_DATA__. React SPAs often have window.__INITIAL_STATE__. Confirmed: Grainger.com ships 110KB of product data in __NEXT_DATA__, bypassing DataDome entirely because no JS executes.
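A minimal sketch of the string parse for step 3, using only the standard library. It assumes the page embeds its state in a `<script id="__NEXT_DATA__">` tag, as Next.js does; the sample HTML and the product fields inside `pageProps` are made up for illustration, and `extract_next_data` is a hypothetical helper name.

```python
import json
import re
from typing import Optional

# Match the JSON body of the __NEXT_DATA__ script tag.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_next_data(html: str) -> Optional[dict]:
    """Return the parsed __NEXT_DATA__ payload, or None if the tag is absent."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        return None
    return json.loads(match.group(1))


# Made-up sample page standing in for a real response body.
sample_html = (
    '<html><body><div id="__next"></div>'
    '<script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {"product": {"sku": "ABC123", "price": 19.99}}}}'
    "</script></body></html>"
)

data = extract_next_data(sample_html)
print(data["props"]["pageProps"]["product"])
```

No JavaScript runs at any point: the data is already in the HTML response, so a plain HTTP fetch plus this parse is enough. A `window.__INITIAL_STATE__` assignment would need a different pattern, since it is a JavaScript expression rather than a JSON-only script tag.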
Step 4 — HTTP + curl_cffi. Use impersonate="chrome131" plus a residential proxy. This resolves ~60% of Akamai targets where sensor.js scoring is light, almost all medium-difficulty Cloudflare targets, and most DataDome XHR endpoints.
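A sketch of what a step 4 request might look like with curl_cffi's requests-style API. The proxy URL is a placeholder, the 30-second timeout is an arbitrary assumption, and `build_request_kwargs` and `fetch` are hypothetical helper names. The third-party import sits inside `fetch` so the rest of the block runs without curl_cffi installed.

```python
def build_request_kwargs(proxy_url: str, timeout: float = 30.0) -> dict:
    """Keyword arguments for a curl_cffi request that impersonates Chrome 131."""
    return {
        "impersonate": "chrome131",  # TLS/HTTP2 fingerprint profile
        "proxies": {"http": proxy_url, "https": proxy_url},
        "timeout": timeout,
    }


def fetch(url: str, proxy_url: str) -> str:
    """Fetch a URL through the proxy and return the response body as text."""
    from curl_cffi import requests  # third-party: pip install curl_cffi

    response = requests.get(url, **build_request_kwargs(proxy_url))
    response.raise_for_status()
    return response.text


# Placeholder residential proxy endpoint (assumption, not a real service).
kwargs = build_request_kwargs("http://user:pass@proxy.example:8000")
print(kwargs["impersonate"])
```

Keeping the profile name in one place makes the TLS profile rotation listed under maintenance a one-line change when a newer Chrome profile is needed.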
Step 5 — Patched browser. Camoufox (reported 100% Cloudflare pass rate in March 2026 on Instagram, Reddit, X, LinkedIn), CloakBrowser (Akamai's 60-extension probe), PatchRight (Kasada). Each addresses a specific layer that JS-level stealth cannot reach.
Step 6 — Managed API. F5 Shape specifically — the custom JS VM makes DIY impractical. Above ~2 engineer-days/month of bypass maintenance, the managed API is cheaper than the engineer.
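The break-even claim can be checked with simple arithmetic: the managed API wins while its bill stays below the cost of the engineer time it replaces. The $600 day rate below is an assumption, not a figure from this article, and the calculation ignores the proxy and compute costs of the DIY route, which would push the break-even volume higher.

```python
def break_even_requests(engineer_days: float, usd_per_day: float,
                        usd_per_1000: float) -> float:
    """Monthly request volume at which managed-API spend equals engineer cost."""
    maintenance_usd = engineer_days * usd_per_day
    return maintenance_usd / usd_per_1000 * 1000


DAY_RATE = 600.0  # assumed fully loaded cost of one engineer-day, in USD

# Two engineer-days per month of bypass maintenance, at both ends of the
# managed-API price range quoted above.
at_high_price = break_even_requests(2, DAY_RATE, 3.0)
at_low_price = break_even_requests(2, DAY_RATE, 0.20)

print(f"break-even at $3.00/1,000: {at_high_price:,.0f} requests/month")
print(f"break-even at $0.20/1,000: {at_low_price:,.0f} requests/month")
```

Below the break-even volume the managed API costs less than the maintenance it replaces; above it, DIY maintenance starts to pay for itself.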
Cost progression — when to escalate
| Step | Cost | Maintenance burden |
|---|---|---|
| 1 — Mobile API | Free | Low (token refresh) |
| 2 — XHR / GraphQL | Free | Low–medium |
| 3 — JSON-in-HTML | Free | Low |
| 4 — HTTP + curl_cffi | Proxy only (~$3–10/GB residential) | Medium (TLS profile rotation) |
| 5 — Patched browser | Proxy + 200MB RAM/instance | Medium–high (per-target tuning) |
| 6 — Managed API | $0.20–$3 per 1,000 requests | Zero |
