What Is mitmproxy?

mitmproxy is a free, open-source intercepting proxy that sits between an app and the internet so you can read and change the HTTPS traffic passing through it. The name comes from "man-in-the-middle": the client connects to mitmproxy instead of the real server, mitmproxy presents a certificate signed by its own CA (which the client must be set up to trust), and that lets it decrypt traffic that would otherwise be unreadable. You run it as a CLI (mitmproxy), a browser UI (mitmweb), or a headless engine (mitmdump), and all three accept inline Python scripts that can log, rewrite, or replay any request while it's in flight. In scraping it's the go-to tool for figuring out what a site or app actually sends. The first step of the scraping decision flow is "intercept the mobile app first", and mitmproxy is how you do that.

Quick facts

Vendor: mitmproxy project (open-source, MIT)
Language: Python (server); scripts in Python
Modes: CLI (mitmproxy), web UI (mitmweb), headless replay (mitmdump)
Use case in scraping: Mobile API discovery, request inspection, replay & rewrite
Limitation: Certificate pinning. Apps that pin their certificates refuse the mitmproxy CA even when the device trusts it

What mitmproxy is for

There are two main jobs it does in scraping:

  1. Mobile API discovery. Install mitmproxy's certificate (the credential a device trusts to verify HTTPS) on an Android emulator or jailbroken iPhone, point the device's proxy setting at mitmproxy, and use the target app normally. Every request becomes readable — the endpoints it calls, the auth tokens it sends, how it signs requests, how it pages through results. This is how scrapers find the unprotected mobile backends sitting behind sites that pay Akamai to protect their websites.
  2. Web request inspection and replay. When a scraper is misbehaving, route it through mitmproxy and re-send individual requests with tweaked headers (in the terminal UI, the e key edits the selected flow and the r key replays it). With an inline Python script, you can rewrite requests on the fly without editing the scraper itself.
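
As a rough sketch of the discovery job, the addon below builds an inventory of the endpoints an app calls while you use it. The file name endpoints.py, the class name, and the summary format are illustrative assumptions, not part of mitmproxy; only the request and done hooks and the request attributes it reads come from mitmproxy's addon API.

```python
# endpoints.py (hypothetical name); load with: mitmdump -s endpoints.py
from collections import defaultdict


class EndpointCatalog:
    """Record which endpoints an app calls and whether it sends credentials."""

    def __init__(self):
        # (host, method, path without query string) -> what was observed
        self.seen = defaultdict(
            lambda: {"count": 0, "query_keys": set(), "auth": False}
        )

    def request(self, flow) -> None:
        # mitmproxy calls this hook once per client request
        req = flow.request
        path = req.path.split("?", 1)[0]
        entry = self.seen[(req.pretty_host, req.method, path)]
        entry["count"] += 1
        entry["query_keys"].update(req.query.keys())
        if "Authorization" in req.headers:
            entry["auth"] = True

    def done(self) -> None:
        # Called once at shutdown: print the inventory
        for (host, method, path), info in sorted(self.seen.items()):
            keys = ",".join(sorted(info["query_keys"])) or "-"
            print(f"{method} {host}{path} x{info['count']} "
                  f"query=[{keys}] auth={info['auth']}")


addons = [EndpointCatalog()]
```

Query-parameter names such as page or cursor showing up in the inventory are usually the quickest hint at how the app paginates.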

mitmweb (the browser UI) is the easiest for one-off use; mitmproxy (the keyboard-driven terminal UI) is faster once you learn it; mitmdump runs without a UI, which is handy in CI or scripted captures.
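
Whichever front end you pick, scripts are loaded the same way with -s. A minimal sketch of an on-the-fly rewrite follows, assuming you want to force a couple of headers on requests to one host. TARGET_HOST, the header values, and the file name rewrite.py are placeholders made up for illustration.

```python
# rewrite.py (hypothetical name); load with: mitmdump -s rewrite.py

TARGET_HOST = "api.example.com"  # assumption: the host you are debugging
OVERRIDES = {                    # assumption: headers you want to force
    "User-Agent": "ExampleApp/5.2 (Android 14)",
    "Accept-Language": "en-US",
}


class HeaderRewriter:
    def request(self, flow) -> None:
        # Runs for each client request before it is forwarded upstream
        if flow.request.pretty_host != TARGET_HOST:
            return  # leave traffic to other hosts untouched
        for name, value in OVERRIDES.items():
            flow.request.headers[name] = value


addons = [HeaderRewriter()]
```

Because the change happens in the proxy, the scraper's own code stays as it was, which makes it easy to test one header at a time.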

mitmproxy vs HTTP Toolkit vs Charles Proxy vs Burp Suite

Four tools cover the intercepting-proxy category, with overlapping use cases:

Tool | Best for | Cost
mitmproxy | CLI/scripting, automation, repeatable captures | Free
HTTP Toolkit | GUI-driven mobile intercept; one-click device setup | Free + Pro ($10/mo)
Charles Proxy | Veteran GUI, polished macOS experience | $50 one-time
Burp Suite | Security recon, intruder/repeater, MCP server | Free / Pro $475/yr

For scraping reconnaissance specifically, mitmproxy is the default — it's free, scriptable, and built squarely around the intercept-and-replay loop. Burp Suite can do the same things, but it's really a penetration-testing tool, and the price reflects that.

The certificate-pinning wall

Roughly half of mainstream mobile apps pin their TLS certificates — the app ships with the expected server certificate's fingerprint baked in and refuses to talk to anything else. That means mitmproxy's certificate, which you installed on the device, is rejected, and the app just shows a network error.
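
One way to tell which hosts are rejecting the proxy, sketched under assumptions: recent mitmproxy releases expose tls_failed_client and tls_established_client hooks, and an addon can tally them per host. The class and file names are hypothetical, and a failed client handshake is only a hint of pinning, since a device that simply does not trust the mitmproxy CA fails the same way.

```python
# pinning_report.py (hypothetical name); load with: mitmdump -s pinning_report.py
# Assumes a mitmproxy version that provides the two TLS hooks used below.


class PinningReport:
    def __init__(self):
        self.failed = {}  # host -> number of failed client handshakes
        self.ok = set()   # hosts that completed a client handshake

    @staticmethod
    def _host(data) -> str:
        # Prefer the SNI sent by the client; fall back to the server address
        if data.conn.sni:
            return data.conn.sni
        address = data.context.server.address
        return address[0] if address else "unknown"

    def tls_established_client(self, data) -> None:
        self.ok.add(self._host(data))

    def tls_failed_client(self, data) -> None:
        host = self._host(data)
        self.failed[host] = self.failed.get(host, 0) + 1
        print(f"client rejected our certificate for {host} (possible pinning)")

    def suspects(self) -> list:
        # Hosts that never completed a handshake are the likeliest to pin
        return sorted(h for h in self.failed if h not in self.ok)


addons = [PinningReport()]
```

Hosts that fail every time while other hosts on the same device succeed are the ones worth escalating on.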

Three escalation steps when pinning blocks you:

  1. Try a different app version. Older versions of the same app often skip pinning. Sideload an APK (the Android install file) from a few releases back via apkpure or similar.
  2. Frida + certificate unpinning (for apps you are authorized to test). Frida is a tool that injects code into a running app. Running frida-server on the device and loading an unpinning script from your machine switches off both okhttp3.CertificatePinner and the Java TrustManagerFactory, the two common pinning mechanisms. (fridantiroot.js, which is often mentioned at this step, bypasses root detection rather than pinning, so treat it as a companion script and not as the unpinning script itself.) This works against most apps. See the mobile API scraping playbook for the full workflow.
  3. objection / static reverse engineering. When pinning is built into native code (banking apps, some games), Frida's default scripts aren't enough. objection handles more cases; truly custom pinning means disassembling the app by hand. By this point you're spending more effort on the intercept than the scraping is worth.

Code example

```python
# Inline mitmproxy addon: extract auth tokens and pagination cursors.
# Save as tokens.py, run with: mitmdump -s tokens.py
# (mitmdump writes to the terminal, which suits the print-based logging below)
import json

from mitmproxy import http


class TokenExtractor:
    def __init__(self):
        # Most recent access token seen, keyed by request host
        self.tokens = {}

    def response(self, flow: http.HTTPFlow) -> None:
        # Capture bearer tokens from any successful login response
        if "/login" in flow.request.path and flow.response.status_code == 200:
            try:
                body = json.loads(flow.response.text or "")
            except ValueError:
                # Covers empty, non-JSON, and undecodable bodies
                # (json.JSONDecodeError is a ValueError subclass)
                body = None
            # Only JSON objects can carry an access_token key
            if isinstance(body, dict) and "access_token" in body:
                self.tokens[flow.request.host] = body["access_token"]
                print(f"captured token for {flow.request.host}")

        # Log cursor-based pagination for later reuse
        if "X-Next-Cursor" in flow.response.headers:
            print(f"{flow.request.path} cursor: {flow.response.headers['X-Next-Cursor']}")


addons = [TokenExtractor()]
```
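
Once a token and cursor have been captured, reusing them outside the proxy might look like the sketch below. It assumes the API accepts the token as a Bearer header and takes the cursor back as a query parameter named cursor. Both are guesses about a hypothetical API, so check them against the traffic you actually captured.

```python
import urllib.parse
import urllib.request


def build_next_page_request(url: str, token: str, cursor: str) -> urllib.request.Request:
    """Build (but do not send) the request for the next page of results."""
    parts = urllib.parse.urlsplit(url)
    query = dict(urllib.parse.parse_qsl(parts.query))
    query["cursor"] = cursor  # assumption: parameter name used by the API
    next_url = urllib.parse.urlunsplit(
        parts._replace(query=urllib.parse.urlencode(query))
    )
    return urllib.request.Request(
        next_url,
        headers={"Authorization": f"Bearer {token}"},  # assumption: Bearer scheme
    )


# Example with placeholder values; pass the result to urllib.request.urlopen to send it
req = build_next_page_request("https://api.example.com/v1/items?limit=50", "TOKEN", "abc123")
```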

Frequently asked questions

Why use mitmproxy instead of Wireshark?

Wireshark only sniffs raw network traffic, so for HTTPS you just see encrypted bytes you can't read. mitmproxy terminates the TLS connection using its own certificate, so you see the actual plaintext request and response bodies. In short: Wireshark is for low-level network debugging; mitmproxy is for the HTTPS application traffic that scrapers actually care about.

Can mitmproxy intercept HTTP/3 / QUIC?

Not yet at production quality. There's an experimental HTTP/3 mode, but it lags behind the official spec. For QUIC-only services (some Google properties), you currently force the client to fall back to HTTP/2 using an upstream rule, then proxy that instead.

Is mitmproxy detectable by the server?

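Often not, but it isn't invisible. mitmproxy adds no tell-tale headers, and because it runs on your own machine or local network, the server sees traffic arriving from your usual IP address. However, mitmproxy terminates the client's TLS connection and opens its own connection to the server, so the handshake the server sees comes from mitmproxy's TLS stack rather than from Chrome or the mobile app. A server that fingerprints TLS handshakes can notice that mismatch. The app itself can also report the interception: some apps phone home with proxy-status flags, in which case you'd disable that telemetry.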

Last updated: 2026-05-31