The toolchain (free, ~30 minutes to set up)
Everything you need is free. Here is the full setup:
- Android Studio + AVD. An AVD is an Android Virtual Device — a phone emulator running on your computer. Create one with API 30+ (Android 11+). Avoid API 28 — the rooting scripts do not support it.
- rootAVD (github.com/newbit1/rootAVD). Rooting gives you admin control over the emulator; this is a one-command script that does it. Afterward, confirm Magisk (the root manager) shows up in the app drawer.
- HTTP Toolkit (free, httptoolkit.com). This is the tool that records the app's traffic. Open it → Intercept → "Android device via ADB". It auto-detects the running AVD and installs its own trusted certificate so it can read the encrypted traffic.
- Install the target app via Google Play on the AVD, or sideload the APK (the Android install file) from apk.support.
- Use the app while HTTP Toolkit captures. Filter by the target domain. The requests that actually return data are usually about a dozen out of hundreds — search, listing, detail.
- Replicate in Python. Right-click any captured request → copy as cURL → import into Postman → confirm it returns data → port to curl_cffi.
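The last step above, going from a copied cURL command to Python, can be sketched with the standard library alone. This is a minimal, assumption-laden parser: it handles only the flags a typical "copy as cURL" export tends to contain (`-X`, `-H`, `--data*`), treats every other flag as taking no argument, and uses a hypothetical `api.example.com` host in the usage note. The function name `parse_curl` is illustrative, not part of any tool mentioned here.

```python
import shlex


def parse_curl(command: str) -> dict:
    """Split a copied cURL command into method, URL, headers, and body."""
    # Join shell line continuations (backslash + newline) before tokenising.
    tokens = shlex.split(command.replace("\\\n", " "))
    if not tokens or tokens[0] != "curl":
        raise ValueError("expected a command starting with 'curl'")

    request = {"method": None, "url": None, "headers": {}, "body": None}
    i = 1
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("-X", "--request"):
            request["method"] = tokens[i + 1].upper()
            i += 2
        elif tok in ("-H", "--header"):
            name, _, value = tokens[i + 1].partition(":")
            request["headers"][name.strip()] = value.strip()
            i += 2
        elif tok in ("-d", "--data", "--data-raw", "--data-binary"):
            request["body"] = tokens[i + 1]
            i += 2
        elif tok.startswith("-"):
            # Assumption: flags not modelled here (e.g. --compressed) take no argument.
            i += 1
        else:
            request["url"] = tok
            i += 1

    # cURL defaults to POST when a body is supplied, otherwise GET.
    if request["method"] is None:
        request["method"] = "POST" if request["body"] is not None else "GET"
    return request
```

Calling `parse_curl("curl 'https://api.example.com/v1/search' -H 'X-App-Version: 5.2.0'")` returns a dict whose `method`, `url`, `headers`, and `body` can then be handed to whichever HTTP client the scraper uses, such as curl_cffi.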
Step-by-step: intercepting an Android app
Android is the easier platform to intercept because Google publishes it as open source, including emulator images that accept certificates you install yourself. (A certificate is what lets the proxy read encrypted traffic.) The full workflow:
- Install Android Studio and create an emulator using an image without Google Play. Google Play images are locked-down production builds: they do not allow elevated (root) access over adb, and apps running on them can use Play Integrity attestation, a check that the device is untampered, to refuse to run. The plain AOSP system images (API 30+) work without that restriction.
- Start mitmproxy on your computer: mitmproxy --mode regular --listen-port 8080. mitmproxy is a proxy that sits between the app and the server so you can see the traffic. Note the host IP the emulator can reach (usually 10.0.2.2).
- Point the emulator at the proxy: Settings → Network → set proxy to 10.0.2.2:8080. Open mitm.it in the emulator browser, download the Android cert, and install it via Settings → Security → User certificates.
- Install the target app. If the app then fails to connect, the usual causes are that it does not trust user-installed certificates (the default for apps targeting Android 7.0 and later) or that it uses certificate pinning, where the app refuses to talk to a server whose certificate it doesn't already recognise. See the next section.
- Use the app normally. The mitmproxy console shows every request and response, so the endpoints, headers, request-signing scheme, and pagination model all become visible right away. Common finds: GraphQL endpoints, signed JWT auth tokens (compact, self-contained login tokens) that expire after an hour, and unprotected list endpoints that only need a couple of mobile-specific headers.
For an app you are permitted to inspect, the result of this exercise is a clear picture of how the mobile backend is structured, even where the company's web stack uses a separate anti-bot product.
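Scrolling the console by hand gets tedious once a session runs to hundreds of requests. As a rough sketch, a small mitmproxy addon can log only the JSON responses from the target domain. It relies on mitmproxy's `response` event hook and the `flow.request` / `flow.response` attributes; the domain `api.example.com` and the class name `JsonEndpointLog` are hypothetical placeholders.

```python
# capture_log.py: load with `mitmdump -s capture_log.py --listen-port 8080`
import json

TARGET_DOMAIN = "api.example.com"  # hypothetical; replace with the app's API host


class JsonEndpointLog:
    """Collects one summary row per JSON response from the target domain."""

    def __init__(self, domain: str = TARGET_DOMAIN):
        self.domain = domain
        self.rows = []

    def response(self, flow):
        # mitmproxy calls this hook once a response has been received.
        host = flow.request.pretty_host
        if not (host == self.domain or host.endswith("." + self.domain)):
            return
        content_type = flow.response.headers.get("content-type", "")
        if "json" not in content_type:
            return  # skip images, HTML, and other non-data responses
        row = {
            "method": flow.request.method,
            "path": flow.request.path,
            "status": flow.response.status_code,
            "request_headers": sorted(flow.request.headers.keys()),
        }
        self.rows.append(row)
        print(json.dumps(row))


# mitmproxy picks up addon instances from this module-level list.
addons = [JsonEndpointLog()]
```

Each printed line is one candidate endpoint, which makes it easier to spot the handful of search, listing, and detail calls among the noise.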
How certificate pinning affects traffic inspection
Roughly half of mainstream apps pin their TLS certificate. Pinning means the app has the expected server certificate's fingerprint baked in and refuses to talk to anything else. The proxy's certificate is therefore rejected, the app shows a network error, and traffic inspection does not work on a pinned app.
Frida is a well-known instrumentation tool sometimes used in mobile app testing to observe how pinning checks behave at runtime. On Android, pinning is commonly implemented through okhttp3.CertificatePinner and javax.net.ssl.TrustManagerFactory. Flutter apps put their pinning in the Dart layer rather than Java. iOS apps use a different stack again.
If pinning lives in native code (rare, but it happens in banking apps), inspection is much harder. At that point the effort often exceeds what a managed scraping API would cost, and the decision flow suggests moving back up the ladder. Note that bypassing pinning on apps you do not own or are not authorized to test may violate the app's terms and applicable law.
What to record before disconnecting the proxy
Once you have a captured session, write down all of this before the session expires — you are documenting the API so you never have to touch the live app again:
- The endpoint path and HTTP method.
- Authentication scheme: Bearer token, signed request, or OAuth refresh flow. Note the TTL (time-to-live, i.e. how long the token stays valid).
- Request signing: many apps sign each request with an HMAC (a checksum keyed by a secret) computed over the body with a shared secret. The secret is hidden in the app binary and usually survives across versions.
- Required headers: X-App-Version, X-Device-ID, X-Build-Number. They look optional, but the API often returns 403 (forbidden) without them.
- Pagination model: how it walks through pages (offset/limit, cursor, or token). Cursor-based pagination from a mobile API is almost always more reliable than offset-based pagination on the web.
- Rate limit — fire 20 requests quickly and watch for a 429 (too many requests) or a rate-limit header. Mobile APIs often have looser limits than the web equivalent.
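To make the request-signing and required-headers items concrete, here is a minimal sketch of one common pattern: an HMAC-SHA256 over the serialized JSON body. Everything app-specific is an assumption for illustration: the header name `X-Signature`, the version and build values, and the choice to sign only the body. A real app may also sign the path, a timestamp, or a nonce, so the captured traffic is the authority.

```python
import hashlib
import hmac
import json


def sign_body(body: bytes, secret: bytes) -> str:
    """Return the hex HMAC-SHA256 of the body, keyed by the shared secret."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def build_signed_request(payload: dict, secret: bytes, device_id: str) -> dict:
    """Serialize the payload and attach the headers a mobile API might require."""
    # Compact, key-sorted JSON so the same payload always yields the same signature.
    body = json.dumps(payload, separators=(",", ":"), sort_keys=True).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-App-Version": "5.2.0",   # hypothetical value; copy from the capture
        "X-Build-Number": "1234",   # hypothetical value; copy from the capture
        "X-Device-ID": device_id,
        "X-Signature": sign_body(body, secret),  # hypothetical header name
    }
    return {"headers": headers, "body": body}
```

The signature has to be computed over the exact bytes that are sent, which is why the function returns the serialized body alongside the headers instead of leaving serialization to the HTTP client.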
Then write the scraper against this documentation, not against the live app. Rotating X-Device-ID per worker, refreshing the auth token before it expires, and honouring the request-signing scheme is enough for most production cases.
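A per-worker session that carries its own device ID and refreshes the token shortly before expiry might look like the sketch below. The `login` callable is a stand-in for whatever authentication call the capture revealed; it is assumed to take a device ID and return a `(token, ttl_seconds)` pair. The class name `WorkerSession` and the 60-second refresh margin are likewise illustrative choices.

```python
import time
import uuid
from typing import Callable, Optional, Tuple


class WorkerSession:
    """One worker's identity: a stable device ID plus a self-refreshing token."""

    def __init__(
        self,
        login: Callable[[str], Tuple[str, float]],  # hypothetical: device_id -> (token, ttl)
        refresh_margin: float = 60.0,
        clock: Callable[[], float] = time.time,
    ):
        self.device_id = str(uuid.uuid4())  # a distinct X-Device-ID per worker
        self._login = login
        self._margin = refresh_margin
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def token(self) -> str:
        # Refresh when there is no token yet, or when expiry is within the margin.
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token, ttl = self._login(self.device_id)
            self._expires_at = self._clock() + ttl
        return self._token

    def headers(self) -> dict:
        return {
            "Authorization": f"Bearer {self.token()}",
            "X-Device-ID": self.device_id,
        }
```

Injecting the clock keeps the refresh logic testable without waiting for a real token to expire.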
Why mobile APIs are softer than the web
Three structural reasons:
- Mobile apps already authenticate. The app ships with an API key or signs requests with a per-user token, so the backend trusts those requests more than anonymous hits from a browser. More trust means lighter bot defences.
- Anti-bot vendors target browsers. Cloudflare, Akamai, and DataDome built their products to catch headless Chrome and Selenium. Traffic from a real device already looks like a real device — there is no equivalent product going after native HTTP clients at scale.
- JS rendering is irrelevant. Mobile APIs return JSON, not HTML. With no DOM there are no hidden honeypot fields and no client-side challenge to trip — the entire browser-fingerprinting category simply doesn't apply.
For example, a retailer's mobile app may hit a direct GraphQL endpoint served by a different backend than the web frontend, which is why the mobile and web paths can carry different anti-bot configurations even when they return the same data.
When mobile API scraping does not work
SSL pinning. Some apps lock onto their own SSL certificate and refuse to talk to an inspection proxy's certificate. Banking apps and high-value retailers commonly pin, and for those apps traffic inspection is generally not possible without authorization from the app owner.
Root detection. Some apps crash or refuse to start on rooted devices. SafetyNet Attestation, Google's check that a device isn't tampered with (since superseded by the Play Integrity API), is the usual mechanism; Magisk Hide / DenyList can often work around it.
ARM-only apps. The default AVD system images target x86 rather than ARM, and some apps refuse to run on x86 emulators. Either use an arm64 emulator (slower) or a physical device with frida-server installed.
Tokens expire. Most apps issue fresh tokens at login. Build a token-refresh step into your scraper rather than relying on a single captured token.
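When the captured token is a JWT, its expiry can usually be read straight from the payload, which tells the scraper when to refresh. The sketch below decodes the payload without verifying the signature (the scraper holds no verification key and only needs the timestamp) and assumes the standard `exp` claim is present. Opaque, non-JWT tokens will raise `ValueError`, and in that case the TTL has to come from the login response instead.

```python
import base64
import json
import time
from typing import Optional


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a three-part JWT")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore the stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def seconds_until_expiry(token: str, now: Optional[float] = None) -> Optional[float]:
    """Return seconds left before the `exp` claim, or None if there is no such claim."""
    exp = jwt_claims(token).get("exp")
    if exp is None:
        return None
    current = time.time() if now is None else now
    return exp - current
```

A scraper can call `seconds_until_expiry` before each batch and trigger its refresh step once the remaining time drops below a chosen margin.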
