Why JA3 broke and JA4 was needed
Every TLS connection starts with a Client Hello, the opening message in which the client lists what it supports. JA3 built its fingerprint from five fields of that message (the TLS version, cipher suites, extensions, elliptic curves, and elliptic-curve point formats), joined them in the exact order the client sent them, and ran MD5 over the result (MD5 is a hash function that turns any input into a fixed-length code). For years a given browser build sent these fields in the same order, so JA3 made an excellent key for blocklisting.
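As a rough illustration of that construction, here is a minimal Python sketch. The field values are a shortened, made-up Client Hello rather than any real browser's, and the sketch skips details such as GREASE filtering that real JA3 implementations handle.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Join the five Client Hello fields, preserving the order the client sent."""
    fields = [
        str(version),
        "-".join(str(c) for c in ciphers),
        "-".join(str(e) for e in extensions),
        "-".join(str(g) for g in curves),
        "-".join(str(p) for p in point_formats),
    ]
    return ",".join(fields)


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """MD5 over the joined string gives the 32-character JA3 value."""
    raw = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()


# Hypothetical, shortened values for illustration only.
hello = dict(
    version=771,
    ciphers=[4865, 4866, 4867],
    extensions=[0, 23, 65281, 10, 11],
    curves=[29, 23, 24],
    point_formats=[0],
)

print(ja3_string(**hello))
print(ja3_hash(**hello))
```

Because the extension list is joined in wire order, swapping any two extensions changes the string and therefore the hash.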
That broke in Chrome 110 (early 2023), when Google shipped TLS extension-order randomisation. The goal was anti-ossification: stopping middleboxes on the internet from hard-coding assumptions about Chrome's traffic. Chrome now shuffles the order of its extensions on every connection. Overnight, a single Chrome install began producing a different JA3 hash on nearly every new connection, effectively billions of possible values. Blocklisting a JA3 became pointless. Worse, the tables turned: a scraper that always sent a fixed extension order now stuck out, because real Chrome was constantly shuffling.
JA4 solves this by sorting the cipher and extension lists before hashing. Once everything is sorted, order no longer matters, so Chrome's randomisation collapses back to a single stable JA4. The cost of throwing away order information is paid back elsewhere in the JA4+ suite.
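The effect of sorting can be seen in a small Python experiment. This is a sketch under assumptions: the extension IDs are an illustrative Chrome-like set, the shuffle is a plain random permutation (real Chrome's shuffling has its own rules), and the two helper functions are simplified stand-ins for the order-sensitive and order-insensitive styles of hashing.

```python
import hashlib
import random


def order_sensitive(extensions):
    """JA3-style: hash the list in the order it was sent."""
    joined = "-".join(str(e) for e in extensions)
    return hashlib.md5(joined.encode("ascii")).hexdigest()


def order_insensitive(extensions):
    """JA4-style: sort first, then hash and truncate to 12 hex characters."""
    joined = ",".join(f"{e:04x}" for e in sorted(extensions))
    return hashlib.sha256(joined.encode("ascii")).hexdigest()[:12]


# Illustrative extension IDs, not captured from a real handshake.
extensions = [0, 23, 65281, 10, 11, 35, 16, 5, 13, 18, 51, 45, 43, 27, 17513]

rng = random.Random(110)  # fixed seed so the run is repeatable
shuffled_hashes = set()
sorted_hashes = set()
for _ in range(1000):
    shuffled = extensions[:]
    rng.shuffle(shuffled)
    shuffled_hashes.add(order_sensitive(shuffled))
    sorted_hashes.add(order_insensitive(shuffled))

print("distinct order-sensitive hashes:", len(shuffled_hashes))
print("distinct sorted hashes:", len(sorted_hashes))
```

Every shuffled connection maps to the same sorted hash, while the order-sensitive hash scatters across many values.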
How a JA4 fingerprint is built
JA3 was one opaque MD5 string you could not read. JA4 is deliberately human-readable in three parts, joined by underscores:
- Prefix (A). A readable summary: protocol (t for TCP, q for QUIC), TLS version (13 = 1.3), whether a hostname was sent (d = domain, i = IP), a two-digit count of ciphers, a two-digit count of extensions, and a two-character tag taken from the first ALPN value (the protocol the client wants to speak; h2 means HTTP/2). Example: t13d1516h2.
- Hash B. The first 12 hex characters of a SHA-256 hash over the sorted cipher-suite list.
- Hash C. The first 12 hex characters of a SHA-256 hash over the sorted extension list plus the signature algorithms.
The full value looks like t13d1516h2_8daaf6152771_b186095e22b6. An analyst can read the prefix at a glance (TCP, TLS 1.3, a hostname present, 15 ciphers, 16 extensions, HTTP/2) without decoding anything, while the two hashes act as the precise match key.
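The three parts can be sketched in Python as follows. This is a simplified reading of the layout described above, not a reference implementation. Several details are assumptions drawn from my understanding of the published JA4 specification and should be checked against it: GREASE values are dropped, the SNI and ALPN extensions are left out of hash C, the signature algorithms are appended in their original order after an underscore, and the ALPN tag is the first and last character of the first ALPN value. The Client Hello values are illustrative.

```python
import hashlib

# GREASE placeholders: 0x0a0a, 0x1a1a, ..., 0xfafa.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def _hash12(text):
    """First 12 hex characters of SHA-256."""
    return hashlib.sha256(text.encode("ascii")).hexdigest()[:12]


def ja4(protocol, tls_version, has_sni, ciphers, extensions, sig_algs, alpn):
    ciphers = [c for c in ciphers if c not in GREASE]
    extensions = [e for e in extensions if e not in GREASE]

    # Part A: the readable prefix.
    first_alpn = alpn[0] if alpn else ""
    alpn_tag = first_alpn[0] + first_alpn[-1] if first_alpn else "00"
    prefix = (
        f"{protocol}{tls_version}{'d' if has_sni else 'i'}"
        f"{min(len(ciphers), 99):02d}{min(len(extensions), 99):02d}{alpn_tag}"
    )

    # Part B: sorted cipher suites.
    hash_b = _hash12(",".join(f"{c:04x}" for c in sorted(ciphers)))

    # Part C: sorted extensions (assumed: minus SNI 0x0000 and ALPN 0x0010),
    # then the signature algorithms in the order they were sent.
    hashed_exts = sorted(e for e in extensions if e not in (0x0000, 0x0010))
    ext_part = ",".join(f"{e:04x}" for e in hashed_exts)
    sig_part = ",".join(f"{s:04x}" for s in sig_algs)
    hash_c = _hash12(f"{ext_part}_{sig_part}" if sig_part else ext_part)

    return f"{prefix}_{hash_b}_{hash_c}"


def read_prefix(prefix):
    """Split part A back into its readable fields."""
    return {
        "protocol": prefix[0],
        "tls_version": prefix[1:3],
        "sni": prefix[3],
        "ciphers": int(prefix[4:6]),
        "extensions": int(prefix[6:8]),
        "alpn": prefix[8:10],
    }


# Illustrative Chrome-like values (one GREASE entry in each list).
ciphers = [0x1A1A, 0x1301, 0x1302, 0x1303, 0xC02B, 0xC02F, 0xC02C, 0xC030,
           0xCCA9, 0xCCA8, 0xC013, 0xC014, 0x009C, 0x009D, 0x002F, 0x0035]
extensions = [0x2A2A, 0x001B, 0x0000, 0x0033, 0x0010, 0x4469, 0x0017, 0x002D,
              0x000D, 0x0005, 0x0023, 0x0012, 0x002B, 0xFF01, 0x000B, 0x000A,
              0x0015]
sig_algs = [0x0403, 0x0804, 0x0401, 0x0503, 0x0805, 0x0501, 0x0806, 0x0601]

fingerprint = ja4("t", "13", True, ciphers, extensions, sig_algs, ["h2", "http/1.1"])
print(fingerprint)
print(read_prefix(fingerprint.split("_")[0]))
```

Reversing or shuffling the cipher and extension lists before calling ja4 leaves the result unchanged, which is the property that survives Chrome's randomisation.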
The JA4+ suite — fingerprinting the whole connection
JA4 on its own only fingerprints the TLS Client Hello. The real strength comes from the + suite, a set of related fingerprints for other layers of the connection that are cross-checked against each other for consistency:
- JA4S — the server's response (which cipher it picked, which extensions).
- JA4H — the HTTP layer: request method, version, the order of headers, whether a cookie and referer are present, accept-language. This is what catches a client that gets the TLS JA4 right but sends Python-shaped HTTP headers.
- JA4L — latency, meaning round-trip timing, used to estimate physical distance and spot proxy hops in between.
- JA4X — a fingerprint of the X.509 certificate (the document that proves a server's identity).
- JA4SSH — a fingerprint of an SSH session.
A scraper is scored on all the relevant members at once. Matching Chrome's JA4 while failing JA4H is the single most common giveaway — it happens whenever a library wraps a Chrome-impersonating TLS stack around its own, non-Chrome HTTP implementation.
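A cross-layer check of this kind might look like the Python sketch below. Everything here is hypothetical: http_summary is only loosely modeled on the first segment of JA4H (the real field layout should be taken from the specification), and layers_agree applies a made-up rule purely to show the shape of a consistency check. The header lists are illustrative.

```python
def http_summary(method, version, headers):
    """Summarise a request; headers is a list of (name, value) in sent order."""
    names = [name.lower() for name, _ in headers]
    lang = next((v for n, v in headers if n.lower() == "accept-language"), "")
    # First four alphanumeric characters of Accept-Language, padded with zeros.
    lang_tag = "".join(ch for ch in lang.lower() if ch.isalnum())[:4].ljust(4, "0")
    counted = [n for n in names if n not in ("cookie", "referer")]
    return (
        method[:2].lower()
        + version
        + ("c" if "cookie" in names else "n")
        + ("r" if "referer" in names else "n")
        + f"{min(len(counted), 99):02d}"
        + lang_tag
    )


def layers_agree(tls_looks_like_chrome, summary):
    """Hypothetical rule: Chrome-looking TLS should arrive with HTTP/2
    and an Accept-Language header."""
    if not tls_looks_like_chrome:
        return True
    return summary[2:4] == "20" and summary[-4:] != "0000"


# Default headers in the style of python-requests over HTTP/1.1.
python_headers = [
    ("User-Agent", "python-requests/2.31.0"),
    ("Accept-Encoding", "gzip, deflate"),
    ("Accept", "*/*"),
    ("Connection", "keep-alive"),
]
# Illustrative browser-style header set over HTTP/2.
browser_headers = [
    ("sec-ch-ua", '"Chromium";v="133"'),
    ("user-agent", "Mozilla/5.0"),
    ("accept", "text/html"),
    ("accept-encoding", "gzip, deflate, br"),
    ("accept-language", "en-US,en;q=0.9"),
]

python_summary = http_summary("GET", "11", python_headers)
browser_summary = http_summary("GET", "20", browser_headers)
print(python_summary, layers_agree(True, python_summary))
print(browser_summary, layers_agree(True, browser_summary))
```

The first request pairs a Chrome-looking TLS layer with Python-shaped headers and fails the check; the second passes.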
What it means for scrapers
The default Python ssl / requests stack produces a JA4 that no browser ever sends, which a JA4-aware vendor can block on sight. The fix is a TLS library that reproduces a real browser's Client Hello field for field: curl_cffi (libcurl + BoringSSL with Chrome presets), tls-client, or rustls-based impersonators. These reproduce the exact cipher list, extensions, supported groups, and ALPN so that, once sorted, the JA4 matches a current Chrome.
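With curl_cffi, a request that uses a Chrome preset can be sketched as below. The wrapper name fetch_as_chrome and its timeout default are hypothetical. The impersonate="chrome" alias is assumed to be available in the installed release; older releases may need a pinned name such as "chrome110".

```python
def fetch_as_chrome(url, timeout=30):
    """Fetch a URL through curl_cffi with a Chrome TLS preset."""
    # Imported lazily so this module still loads where curl_cffi is absent.
    from curl_cffi import requests as curl_requests

    # "chrome" is assumed to resolve to the newest preset the library ships.
    return curl_requests.get(url, impersonate="chrome", timeout=timeout)
```

Calling fetch_as_chrome("https://example.com/") returns a response object whose status_code and text can be read as with requests. This only addresses the TLS layer; the HTTP headers still have to look like the same browser.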
Two traps remain. First, cross-layer coherence: the JA4H has to match too, and most HTTP clients get this wrong (header order, capitalisation, the order of pseudo-headers). Second, version drift: a JA4 frozen to Chrome 120 while live Chrome is on 133 becomes an anomaly of its own, so impersonation presets need regular updating. The most reliable approach is to drive a real browser engine, or use a tool that keeps its presets current, rather than hand-building a Client Hello. It is the same coherence problem clustering exploits, just pushed down to the network layer.
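Version drift is easy to guard against mechanically. The Python sketch below assumes preset names that embed a major version (such as "chrome120"); the helper names and the max_lag threshold are hypothetical and would need tuning.

```python
import re


def preset_major(preset):
    """Extract the major version from a preset name like 'chrome120'."""
    match = re.search(r"(\d+)", preset)
    return int(match.group(1)) if match else None


def is_stale(preset, live_major, max_lag=4):
    """Flag a preset that trails the live browser by more than max_lag majors."""
    major = preset_major(preset)
    if major is None:
        # An unversioned alias; nothing to compare against here.
        return False
    return live_major - major > max_lag


print(is_stale("chrome120", live_major=133))
print(is_stale("chrome131", live_major=133))
```

A check like this can run in CI or at start-up so that an outdated preset is noticed before it is fingerprinted as an anomaly.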
