Why JA3 broke and JA4 was needed
JA3 built its hash from the TLS Client Hello by concatenating the TLS version, cipher suites, extensions, elliptic curves, and elliptic-curve point formats in the order the client sent them, then MD5-hashing the result. That order was stable per client for years, which made JA3 an excellent blocklist key.
In Chrome 110 (early 2023) Google shipped TLS extension-order randomisation as an anti-ossification measure: Chrome shuffles the order of its extensions on every connection. Overnight, a single Chrome install started producing a different JA3 hash on nearly every new connection, with effectively billions of possible values. Blocklisting a JA3 became pointless, and ironically a scraper that sent a fixed extension order now stood out against real Chrome’s shuffling.
JA4 solves this by sorting the cipher and extension lists before hashing. Order no longer matters, so Chrome’s randomisation collapses back to a single stable JA4. The trade-off — throwing away order information — is recovered elsewhere in the JA4+ suite.
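The effect can be reproduced in a few lines. The following is a minimal sketch, not a reference implementation: the numeric values are illustrative rather than a real capture, and sorted_extension_key is a hypothetical helper that hashes only the extension list to isolate the idea of sorting before hashing.

```python
import hashlib
import random


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """MD5 over the JA3 string: five comma-separated fields, values dash-joined in wire order."""
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return hashlib.md5(",".join(fields).encode()).hexdigest()


def sorted_extension_key(extensions):
    """Order-insensitive key (hypothetical helper): sort first, then hash."""
    joined = ",".join(f"{e:04x}" for e in sorted(extensions))
    return hashlib.sha256(joined.encode()).hexdigest()[:12]


# Illustrative values only, not taken from a real browser capture.
version = 771
ciphers = [4865, 4866, 4867, 49195, 49199]
extensions = [0, 23, 65281, 10, 11, 35, 16, 5, 13, 18, 51, 45, 43, 27]
curves = [29, 23, 24]
point_formats = [0]

# Simulate one more connection from the same client with shuffled extensions.
rng = random.Random(0)
shuffled = extensions[:]
while shuffled == extensions:
    rng.shuffle(shuffled)

ja3_first = ja3_hash(version, ciphers, extensions, curves, point_formats)
ja3_second = ja3_hash(version, ciphers, shuffled, curves, point_formats)

print("JA3 changes with order:  ", ja3_first != ja3_second)
print("sorted key is unchanged: ",
      sorted_extension_key(extensions) == sorted_extension_key(shuffled))
```

The same client yields two different JA3 hashes, while the sorted key is identical across both connections.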
How a JA4 fingerprint is built
Unlike JA3’s single opaque MD5, JA4 is deliberately human-readable in three parts, joined by underscores:
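- Prefix (part a). Protocol (t for TCP, q for QUIC), TLS version (13 = 1.3), SNI flag (d = SNI present, so the client named a domain; i = no SNI, typically a connection straight to an IP), two-digit cipher count, two-digit extension count, and a two-character tag taken from the first ALPN value (h2 for HTTP/2). GREASE values are not counted. Example: t13d1516h2.
- Hash B. The first 12 hex chars of a SHA-256 over the sorted cipher-suite list.
- Hash C. The first 12 hex chars of a SHA-256 over the sorted extension list (leaving out SNI and ALPN, which the prefix already captures) plus the signature algorithms in the order the client sent them.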
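The three parts can be sketched as follows. This is a simplified illustration based on the publicly documented JA4 format, not the reference implementation: build_ja4 and its parameters are hypothetical names, edge cases (empty lists, non-alphanumeric ALPN values) are ignored, and the input lists are illustrative Chrome-like values rather than a verified capture.

```python
import hashlib


def is_grease(value: int) -> bool:
    """GREASE values look like 0x0a0a, 0x1a1a, ... 0xfafa (both bytes equal, low nibble 0xa)."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def hash12(text: str) -> str:
    """First 12 hex characters of SHA-256."""
    return hashlib.sha256(text.encode()).hexdigest()[:12]


def build_ja4(protocol, tls_version, has_sni, ciphers, extensions, alpn, sig_algs):
    # GREASE is dropped before counting and hashing.
    ciphers = [c for c in ciphers if not is_grease(c)]
    extensions = [e for e in extensions if not is_grease(e)]

    # Part a: readable prefix. ALPN tag = first and last character of the first ALPN value.
    alpn_tag = (alpn[0] + alpn[-1]) if alpn else "00"
    prefix = (
        f"{protocol}{tls_version}{'d' if has_sni else 'i'}"
        f"{min(len(ciphers), 99):02d}{min(len(extensions), 99):02d}{alpn_tag}"
    )

    # Part b: sorted cipher suites as 4-digit hex, comma-separated.
    cipher_part = ",".join(f"{c:04x}" for c in sorted(ciphers))

    # Part c: sorted extensions without SNI (0x0000) and ALPN (0x0010),
    # then "_" and the signature algorithms in their original order.
    ext_part = ",".join(
        f"{e:04x}" for e in sorted(extensions) if e not in (0x0000, 0x0010)
    )
    sig_part = ",".join(f"{s:04x}" for s in sig_algs)
    hash_c_input = f"{ext_part}_{sig_part}" if sig_part else ext_part

    return f"{prefix}_{hash12(cipher_part)}_{hash12(hash_c_input)}"


# Illustrative Chrome-like inputs (assumed for the example), with GREASE mixed in.
CIPHERS = [0x1A1A, 0x1301, 0x1302, 0x1303, 0xC02B, 0xC02F, 0xC02C, 0xC030,
           0xCCA9, 0xCCA8, 0xC013, 0xC014, 0x009C, 0x009D, 0x002F, 0x0035]
EXTENSIONS = [0x2A2A, 0x0000, 0x0017, 0xFF01, 0x000A, 0x000B, 0x0023, 0x0010,
              0x0005, 0x000D, 0x0012, 0x0033, 0x002D, 0x002B, 0x001B, 0x4469,
              0x0015, 0x3A3A]
SIG_ALGS = [0x0403, 0x0804, 0x0401, 0x0503, 0x0805, 0x0501, 0x0806, 0x0601]

fingerprint = build_ja4("t", "13", True, CIPHERS, EXTENSIONS, "h2", SIG_ALGS)
print(fingerprint)
```

Reversing or shuffling CIPHERS and EXTENSIONS leaves the output unchanged, which is the property that makes JA4 robust against Chrome's randomisation.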
The result looks like t13d1516h2_8daaf6152771_b186095e22b6. An analyst can read the prefix at a glance (TCP, TLS 1.3, SNI present, 15 ciphers, 16 extensions, HTTP/2) without decoding anything, and the two hashes serve as the precise match key.
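Reading the prefix mechanically is straightforward. A minimal sketch, assuming only the t and q protocol codes described above; parse_ja4 and JA4Parts are hypothetical names chosen for this example.

```python
import re
from typing import NamedTuple


class JA4Parts(NamedTuple):
    protocol: str         # "t" = TCP, "q" = QUIC
    tls_version: str      # "13" = TLS 1.3
    sni: str              # "d" = SNI present, "i" = no SNI
    cipher_count: int
    extension_count: int
    alpn: str             # e.g. "h2"
    cipher_hash: str
    extension_hash: str


_JA4_RE = re.compile(
    r"^([tq])([0-9a-z]{2})([di])(\d{2})(\d{2})([0-9a-z]{2})"
    r"_([0-9a-f]{12})_([0-9a-f]{12})$"
)


def parse_ja4(fingerprint: str) -> JA4Parts:
    """Split a JA4 string into its readable fields; raise ValueError if malformed."""
    match = _JA4_RE.match(fingerprint)
    if match is None:
        raise ValueError(f"not a JA4 fingerprint: {fingerprint!r}")
    proto, version, sni, n_ciphers, n_exts, alpn, hash_b, hash_c = match.groups()
    return JA4Parts(proto, version, sni, int(n_ciphers), int(n_exts), alpn, hash_b, hash_c)


parts = parse_ja4("t13d1516h2_8daaf6152771_b186095e22b6")
print(parts.cipher_count, "ciphers,", parts.extension_count, "extensions, ALPN", parts.alpn)
```

The cipher count comes before the extension count in the prefix, so 1516 reads as 15 ciphers and 16 extensions.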
The JA4+ suite — fingerprinting the whole connection
JA4 alone fingerprints only the TLS Client Hello. The power of the approach lies in the JA4+ suite, which fingerprints other layers so that the results can be cross-checked for coherence:
- JA4S — the server’s response (cipher chosen, extensions).
- JA4H — the HTTP layer: method, version, header order, cookie and referer presence, accept-language. This is what catches a client that nails the TLS JA4 but sends Python-shaped headers.
- JA4L — latency / round-trip timing, used to estimate physical distance and spot proxy hops.
- JA4X — X.509 certificate fingerprint.
- JA4SSH — SSH session fingerprint.
A scraper is scored on all relevant members at once. Matching Chrome’s JA4 while failing JA4H is one of the most common tells: it happens whenever a library wraps a Chrome-impersonating TLS stack around its own HTTP implementation.
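A cross-layer check of this kind might look like the sketch below. Everything here is an assumption made for illustration: the KNOWN_PAIRS table, the verdict labels, and the placeholder JA4H strings are hypothetical, and a real vendor would score many more signals than a single pair lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    ja4: str    # TLS Client Hello fingerprint
    ja4h: str   # HTTP-layer fingerprint


# Hypothetical reference data: JA4H values previously seen together with each JA4.
# The JA4H strings are symbolic placeholders, not real fingerprints.
KNOWN_PAIRS = {
    "t13d1516h2_8daaf6152771_b186095e22b6": {"<ja4h-observed-from-real-chrome>"},
}


def coherence_verdict(obs: Observation, known_pairs=KNOWN_PAIRS) -> str:
    """Classify an observation by whether its TLS and HTTP layers agree."""
    expected = known_pairs.get(obs.ja4)
    if expected is None:
        return "unknown-tls"          # JA4 not in the reference set at all
    if obs.ja4h in expected:
        return "coherent"             # both layers look like the same client
    return "tls-http-mismatch"        # browser-like TLS, non-browser HTTP


real_browser = Observation(
    "t13d1516h2_8daaf6152771_b186095e22b6", "<ja4h-observed-from-real-chrome>"
)
wrapped_client = Observation(
    "t13d1516h2_8daaf6152771_b186095e22b6", "<ja4h-from-python-http-stack>"
)

print(coherence_verdict(real_browser))
print(coherence_verdict(wrapped_client))
```

The second observation is the case described above: the TLS layer passes as Chrome, but the HTTP fingerprint does not belong to any client known to send that Client Hello.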
What it means for scrapers
The default Python ssl / requests stack produces a JA4 that no mainstream browser sends, which makes it an easy block for any JA4-aware vendor. The fix is a TLS library that reproduces a real browser’s Client Hello: curl_cffi (Python bindings to curl-impersonate, a libcurl build on BoringSSL with Chrome presets), tls-client, or rustls-based impersonators. These reproduce the cipher list, extensions, supported groups, signature algorithms, and ALPN so the sorted JA4 matches a current Chrome.
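With curl_cffi the switch is a one-argument change. A minimal sketch, assuming curl_cffi is installed and that its impersonate parameter accepts the generic "chrome" alias (recent releases do; older ones need a pinned name such as "chrome110"). fetch_as_chrome is a hypothetical wrapper, not part of the library.

```python
def fetch_as_chrome(url: str, preset: str = "chrome", timeout: float = 30.0):
    """Fetch a URL with a Chrome-like Client Hello via curl_cffi.

    `preset` is passed to curl_cffi's `impersonate` argument; which preset
    names are available depends on the installed curl_cffi version.
    """
    # Imported lazily so this module still loads where curl_cffi is absent.
    from curl_cffi import requests  # third-party: pip install curl_cffi

    return requests.get(url, impersonate=preset, timeout=timeout)


# Usage (performs a real network request, so it is left commented out):
# response = fetch_as_chrome("https://example.com/")
# print(response.status_code)
```

This only addresses the TLS layer. The headers the call sends still determine the JA4H, so the cross-layer check continues to apply.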
Two pitfalls remain. First, cross-layer coherence — the JA4H must also match, which most HTTP clients get wrong (header order, casing, pseudo-header order). Second, version drift — a JA4 frozen to Chrome 120 while live Chrome is on 133 becomes its own anomaly, so impersonation presets need updating. The most reliable answer is to drive a real browser engine, or a build that maintains current presets, rather than hand-assembling a Client Hello. This is the same coherence problem clustering exploits, pushed down to the network layer.
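Version drift is easy to monitor. The sketch below assumes preset names embed the browser major version (as in chrome120) and that the current live major version is supplied from elsewhere; the max_lag threshold of 4 is an arbitrary illustrative choice, and all function names are hypothetical.

```python
import re


def preset_major(preset: str) -> int:
    """Extract the browser major version from a preset name like 'chrome120'."""
    match = re.search(r"\d+", preset)
    if match is None:
        raise ValueError(f"no version number in preset name: {preset!r}")
    return int(match.group())


def versions_behind(preset: str, live_major: int) -> int:
    """How many major versions the preset lags behind the live browser."""
    return max(0, live_major - preset_major(preset))


def is_stale(preset: str, live_major: int, max_lag: int = 4) -> bool:
    """Flag presets that lag by more than `max_lag` major versions (assumed threshold)."""
    return versions_behind(preset, live_major) > max_lag


print(versions_behind("chrome120", 133))  # 13
print(is_stale("chrome120", 133))         # True
print(is_stale("chrome131", 133))         # False
```

A check like this can run at startup or in CI so that an outdated preset is caught before it becomes an anomaly in production traffic.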
