What DataDome is
DataDome is a reverse-proxy WAF that runs at the application server, not at the CDN edge. Every request is forwarded synchronously to DataDome's scoring service, which returns a verdict in roughly 2 ms. The scorer is per-customer — around 85,000 ML models, one per protected site — so the same TLS, browser and proxy combination can pass on one DataDome customer and fail on another.
Low-trust requests surface as one of:
- A silent 403 with the x-datadome header set.
- A GeeTest-style slider captcha served inline.
- A block page with a Reference #.
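A scraper usually needs to tell these three outcomes apart before deciding what to do next. The following is a minimal sketch, not DataDome's documented behaviour: the function name and labels are hypothetical, and the "captcha" substring is an assumed marker for the slider page. Only the 403 status, the x-datadome header and the "Reference #" text come from the list above.

```python
def classify_datadome_response(status: int, headers: dict, body: str) -> str:
    """Return a coarse label for a response (labels and markers are illustrative)."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    header_names = {name.lower() for name in headers}
    text = body.lower()

    # Check the body first: challenge pages may also arrive with a 403 status.
    if "captcha" in text:  # assumed marker for the inline slider captcha
        return "captcha"
    if "reference #" in text:  # block page carrying a reference number
        return "block_page"
    if status == 403 and "x-datadome" in header_names:
        return "silent_403"
    return "ok"
```

A caller might retry with a different proxy class on "silent_403" and switch to real browser execution on "captcha"; that routing choice is left to the caller.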
The four signal categories
1. IP address reputation
IP reputation accounts for roughly 25–30% of the score on its own — the heaviest single input.
- Datacenter IPs (AWS, GCP, Azure, DigitalOcean, OVH…) — pre-scored low. DataDome maintains one of the more accurate datacenter-range databases in the industry; many of these ranges are blanket-blocked on Etsy and Leboncoin before any other check runs.
- Residential IPs — assigned by ISPs to home connections, higher baseline trust.
- Mobile IPs — cell tower and CGNAT pools, highest baseline trust.
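To make the ordering concrete, here is a small sketch of how a range database could map an address to one of these three classes. Everything in it is an assumption for illustration: the CIDR blocks are RFC 5737 documentation ranges rather than real provider ranges, the trust values are arbitrary ranks that only encode the ordering described above, and treating unknown addresses as lowest trust is a choice made for this example.

```python
import ipaddress

# Hypothetical ranges (RFC 5737 documentation blocks), for illustration only.
NETWORK_CLASSES = [
    (ipaddress.ip_network("192.0.2.0/24"), "datacenter"),
    (ipaddress.ip_network("198.51.100.0/24"), "residential"),
    (ipaddress.ip_network("203.0.113.0/24"), "mobile"),
]

# Arbitrary ranks: only the ordering datacenter < residential < mobile matters.
BASELINE_TRUST = {"unknown": 0, "datacenter": 0, "residential": 1, "mobile": 2}


def classify_ip(address: str) -> str:
    """Return the class of the first matching range, or "unknown"."""
    ip = ipaddress.ip_address(address)
    for network, label in NETWORK_CLASSES:
        if ip in network:
            return label
    return "unknown"


def baseline_trust(address: str) -> int:
    """Look up the illustrative trust rank for an address."""
    return BASELINE_TRUST[classify_ip(address)]
```

The same lookup is useful on the scraping side for checking which class a proxy pool's exit addresses fall into before sending traffic through it.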
2. The WASM boring_challenge and the datadome cookie
DataDome's signature component is the WASM boring_challenge — a Rust-compiled state machine served as WebAssembly and executed in the browser. It produces a token that's POSTed to js.datadome.co, which then sets the datadome cookie that authorizes future requests.
Because the challenge is real WASM running against real browser APIs, it can't be solved without an actual browser execution context. The challenge also probes the CPU through SIMD timing, which exposes headless environments in a way that no stealth-browser JS patch covers. The sensor itself collects the usual fingerprint surface (canvas, WebGL, audio, fonts, screen metrics, timezone, navigator.webdriver, window.chrome) and feeds it into the WASM state.
3. HTTP and TLS fingerprinting
DataDome is one of the few WAFs that publicly markets HTTP/2 fingerprinting as a detection layer.
- Most scraping libraries still default to HTTP/1.1. Real Chrome and Firefox haven't in years.
- libcurl and Go's net/http produce JA3 signatures that don't match any real browser, even when they negotiate HTTP/2.
- HTTP/2 fingerprinting tracks pseudo-header order, SETTINGS frame values, and window-update sizes.
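For readers unfamiliar with JA3: it is an MD5 hash over five ClientHello fields (TLS version, cipher suites, extensions, elliptic curves, point formats), each written as dash-joined decimal values and joined with commas. The sketch below shows that computation. The sample values are illustrative and are not taken from any real browser or library.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Build the comma-separated JA3 string from ClientHello values."""
    fields = [
        str(version),
        "-".join(str(v) for v in ciphers),
        "-".join(str(v) for v in extensions),
        "-".join(str(v) for v in curves),
        "-".join(str(v) for v in point_formats),
    ]
    return ",".join(fields)


def ja3_hash(version, ciphers, extensions, curves, point_formats) -> str:
    """Return the MD5 hex digest of the JA3 string."""
    raw = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()


# Illustrative ClientHello values (not a real client's handshake).
sample = dict(
    version=771,
    ciphers=[4865, 4866, 4867],
    extensions=[0, 23, 65281, 10, 11],
    curves=[29, 23, 24],
    point_formats=[0],
)
print(ja3_string(**sample))
print(ja3_hash(**sample))
```

Because the values are hashed in the order the client sends them, two clients offering the same cipher suites in a different order produce different hashes, which is why a library's default TLS stack is distinguishable from a browser's.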
4. Behavioural and pattern analysis
DataDome runs continuous ML pattern analysis on connection history:
- The datadome cookie sent from a different IP than the one that minted it.
- Reused sensor payloads across pages instead of fresh ones per navigation.
- Honeypot link hits.
- Bursty request timing.
- Missing real-browser headers (Sec-Fetch-*, Accept-Language, sec-ch-ua).
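The header check is the easiest of these to verify locally before sending a request. The sketch below lists which of the headers named above are absent from an outgoing header set. Expanding Sec-Fetch-* to Sec-Fetch-Site, Sec-Fetch-Mode and Sec-Fetch-Dest is an assumption made for this example, as is the function name; which headers DataDome actually inspects is not stated in the text.

```python
# Header names compared in lowercase; the exact set is an assumption.
EXPECTED_HEADERS = (
    "accept-language",
    "sec-ch-ua",
    "sec-fetch-site",
    "sec-fetch-mode",
    "sec-fetch-dest",
)


def missing_browser_headers(headers: dict) -> list:
    """Return the expected header names that are absent (case-insensitive)."""
    present = {name.lower() for name in headers}
    return [name for name in EXPECTED_HEADERS if name not in present]


# A bare HTTP-library request typically carries none of them.
print(missing_browser_headers({"User-Agent": "python-requests/2.31"}))
```

An empty list only means the names are present. The values still have to be consistent with the claimed browser and with each other.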
What this means for developers
The per-site model architecture means there is no single "DataDome solution" — a setup that works on a news customer may fail on an e-commerce one with stricter scoring. Three patterns are common in production:
- Look in the initial HTML first. Many DataDome-protected Next.js sites embed full page state in a __NEXT_DATA__ script tag. If the data is in the first HTML response, the WASM challenge never runs because there is no XHR to gate. curl_cffi plus a residential proxy is sufficient for those cases.
- Mobile or ISP residential proxies for XHR endpoints: IP weighting is so heavy that switching from datacenter to mobile-4G frequently flips a session from blocked to 200 OK with no other change.
- Real browser execution when the page actually runs the WASM challenge: Camoufox with aligned IP/timezone/locale, or a managed scraping API.
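The first pattern can be sketched with the standard library alone. The parser below pulls the JSON out of a script tag whose id is __NEXT_DATA__ and returns it as a Python object. The class and function names are hypothetical, and the sample HTML is made up. Fetching the page (for example with curl_cffi) is left out so the example stays self-contained.

```python
import json
from html.parser import HTMLParser


class NextDataParser(HTMLParser):
    """Collect the text content of <script id="__NEXT_DATA__">."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)


def extract_next_data(html: str):
    """Return the parsed page state, or None if the tag is absent."""
    parser = NextDataParser()
    parser.feed(html)
    parser.close()
    payload = "".join(parser._chunks)
    return json.loads(payload) if payload else None


# Made-up sample page for illustration.
sample_html = (
    '<html><body><script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {"items": [1, 2]}}}'
    "</script></body></html>"
)
state = extract_next_data(sample_html)
```

If extract_next_data returns None, the data is probably loaded by a later XHR, and one of the other two patterns applies.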
For reference, a minimal managed-API example:
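import requests

# POST the target URL and a session name to the managed API.
response = requests.post(
    'https://publisher.scrappey.com/api/v1',
    json={
        'cmd': 'request.get',
        'url': 'https://example.com/search?q=...',
        'session': 'dd-session-1',
    },
    headers={'Authorization': 'Bearer YOUR_API_KEY'},
    timeout=120,  # solving a challenge can take a while; never wait forever
)
response.raise_for_status()  # surface HTTP errors instead of parsing an error body
print(response.json()['solution']['response'])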
DataDome is particularly sensitive to IP/cookie mismatches — the datadome cookie minted on one IP is treated with suspicion when sent from another, so a stable exit IP per session matters.
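One way to respect that constraint in client code is to record which exit IP a cookie was obtained on and refuse to replay it from anywhere else. The following is a minimal sketch under that assumption. The StickySession class and its methods are hypothetical and do not belong to any library, and the addresses are documentation ranges.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StickySession:
    """Bind the datadome cookie to the exit IP it was minted on."""

    cookies: dict = field(default_factory=dict)
    minted_ip: Optional[str] = None

    def store_cookie(self, value: str, exit_ip: str) -> None:
        # Remember both the cookie and the IP that obtained it.
        self.cookies["datadome"] = value
        self.minted_ip = exit_ip

    def cookies_for(self, current_ip: str) -> dict:
        # If the exit IP changed, drop the cookie rather than replay it.
        if self.minted_ip is not None and current_ip != self.minted_ip:
            self.cookies.pop("datadome", None)
            self.minted_ip = None
        return dict(self.cookies)


session = StickySession()
session.store_cookie("abc123", exit_ip="203.0.113.5")
same_ip = session.cookies_for("203.0.113.5")     # cookie is sent
other_ip = session.cookies_for("198.51.100.7")   # cookie is dropped
```

Dropping the cookie forces a fresh challenge on the new IP, which costs a round trip but avoids presenting the mismatch described above.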
Sites commonly fronted by DataDome
E-commerce, classifieds, news and travel dominate: Etsy.com, Hermes.com, Leboncoin.fr, Marketwatch.com, Reuters.com, Tripadvisor.com, WSJ.com, Wellfound.com. Many of these rotate between DataDome, Cloudflare, Akamai and PerimeterX depending on conditions.
Summary
DataDome scores each request in ~2 ms against a per-site ML model using IP reputation (25–30% of the score), the WASM boring_challenge and datadome cookie, TLS and HTTP/2 fingerprints, and behavioural patterns. The per-customer architecture means detection behaviour varies between sites even when the underlying signals don't, which is the main reason setups that work on one DataDome target may not generalise to another.
