The four layers of anti-bot detection
Modern bot-protection products check traffic against four separate layers. Failing any one layer is usually enough to get blocked: the layers act like a row of gates, not a points total, so you have to clear all of them.
| Layer | What's inspected | Fires before… |
|---|---|---|
| 1. Network | TLS Client Hello (JA4), HTTP/2 SETTINGS frame, TCP options, IP reputation, ASN | HTML is served |
| 2. JavaScript | Canvas / WebGL / AudioContext fingerprints, navigator properties, Function.toString() inspection, extension probes | XHR / API calls fire |
| 3. WebAssembly | WASM SIMD CPU profile, SharedArrayBuffer timer precision, hyphenation dictionary checks | Challenge token is issued |
| 4. Behavioural | Mouse movement Bezier curves, scroll cadence, keypress timing, click-to-event latency | Score is finalised over multiple requests |
Each layer runs at a different moment. Layer 1 inspects the raw connection before any HTML is sent back; this includes the TLS Client Hello, the first handshake message a browser sends, which is summarised as a JA4 fingerprint. Layer 2 runs JavaScript in the page to probe the browser itself. Layer 3 leans on WebAssembly (compiled code that runs in the browser) for low-level CPU and timing checks. Layer 4 watches how you actually behave over several requests. As a result, a scraper using curl_cffi, which only handles Layer 1, will pass against vendors that check only Layer 1, such as older Imperva, but fail against anything that loads sensor.js. A patched browser (Layers 1+2) will pass Akamai's static checks but fail DataDome's behavioural ML.
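The gate model can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not how any vendor implements it: the `Layer` enum, the `first_failed_layer` helper, and the tool and vendor profiles are all hypothetical names chosen for the example.

```python
from enum import IntEnum
from typing import Iterable, Optional


class Layer(IntEnum):
    NETWORK = 1
    JAVASCRIPT = 2
    WEBASSEMBLY = 3
    BEHAVIOURAL = 4


def first_failed_layer(
    client_covers: Iterable[Layer], vendor_checks: Iterable[Layer]
) -> Optional[Layer]:
    """Return the earliest layer the vendor checks that the client
    does not cover, or None if every checked gate is cleared."""
    covered = set(client_covers)
    for layer in sorted(vendor_checks):
        if layer not in covered:
            return layer  # one failed gate is enough to be blocked
    return None


# Hypothetical profiles, simplified from the examples in the text.
CURL_CFFI = {Layer.NETWORK}
PATCHED_BROWSER = {Layer.NETWORK, Layer.JAVASCRIPT}
NETWORK_ONLY_VENDOR = {Layer.NETWORK}
FULL_STACK_VENDOR = set(Layer)

# A Layer 1 tool clears a Layer 1-only vendor...
passes = first_failed_layer(CURL_CFFI, NETWORK_ONLY_VENDOR) is None
# ...but is stopped at the JavaScript gate by a full-stack vendor.
blocked_at = first_failed_layer(CURL_CFFI, FULL_STACK_VENDOR)
```

Because the function returns on the first uncovered layer, covering three of the four layers gives the same outcome as covering none: a block. That is the practical difference between gates and a points total.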
The five-vector coherence test
On top of the four detection layers, vendors run a separate identity-coherence check. The idea is simple: a real visitor's details should all tell the same story. These five vectors must agree:
- IP — geolocation, ASN type (residential / datacenter / mobile)
- Timezone — Intl.DateTimeFormat().resolvedOptions().timeZone
- Accept-Language — HTTP header
- WebRTC — candidate IP exposed by STUN/TURN
- DNS — resolver used (matches ISP or VPN?)
Here is what coherent looks like: an IP in São Paulo, a timezone of America/Sao_Paulo, an Accept-Language: pt-BR, a WebRTC candidate that matches the proxy, and a Brazilian ISP DNS resolver — every signal points to the same person in the same place. Now the giveaway: a US datacenter IP with a Tokyo timezone, English Accept-Language, and a WebRTC leak that reveals the operator's real home IP. That mismatch is the most common scraping signature and is trivially blocked. Proxy-rotation tools that change only the IP fail this test every time, because they leave the other four vectors pointing elsewhere.
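To make the coherence idea concrete, here is a minimal Python sketch that compares the other four vectors against the IP. It is an assumption-laden illustration, not a vendor's actual logic: the `Signals` fields, the tiny `TZ_COUNTRY` lookup table, and the rule that a bare language tag such as `en` carries no region information are all choices made for this example, and the IP addresses are reserved documentation addresses.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical lookup; a real check would use a full timezone database.
TZ_COUNTRY = {
    "America/Sao_Paulo": "BR",
    "America/New_York": "US",
    "Asia/Tokyo": "JP",
}


@dataclass
class Signals:
    ip: str               # address the server sees
    ip_country: str       # country from IP geolocation
    timezone: str         # value reported by Intl.DateTimeFormat()
    accept_language: str  # raw Accept-Language header
    webrtc_ip: str        # candidate IP exposed via WebRTC
    dns_country: str      # country of the DNS resolver


def language_region(accept_language: str) -> str:
    """Region subtag of the first language, e.g. 'pt-BR' -> 'BR'."""
    first = accept_language.split(",")[0].split(";")[0].strip()
    parts = first.split("-")
    return parts[1].upper() if len(parts) > 1 else ""


def incoherent_vectors(s: Signals) -> List[str]:
    """Names of the vectors that disagree with the IP."""
    problems = []
    if TZ_COUNTRY.get(s.timezone) != s.ip_country:
        problems.append("timezone")
    region = language_region(s.accept_language)
    if region and region != s.ip_country:  # bare tags are not flagged
        problems.append("accept-language")
    if s.webrtc_ip != s.ip:
        problems.append("webrtc")
    if s.dns_country != s.ip_country:
        problems.append("dns")
    return problems


coherent = Signals("203.0.113.10", "BR", "America/Sao_Paulo",
                   "pt-BR,pt;q=0.9", "203.0.113.10", "BR")
leaky = Signals("198.51.100.7", "US", "Asia/Tokyo",
                "en-US,en;q=0.9", "192.0.2.44", "US")
```

With these inputs, `incoherent_vectors(coherent)` returns an empty list, while `incoherent_vectors(leaky)` returns `["timezone", "webrtc"]`. Swapping only `ip` and `ip_country` on a profile leaves the other fields untouched, which is why IP-only rotation keeps tripping this kind of check.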
