What Is a DNS Leak in Web Scraping?

A DNS leak is when a client resolves hostnames through its own DNS resolver instead of through the proxy, revealing the real network behind the proxy. Even with a working proxy carrying your HTTP traffic, if the DNS lookup for the target hostname goes out over your real connection, your ISP and the DNS server see exactly which sites you are visiting - and the resolver's location can betray your true region. The classic cause is using socks5:// (client-side resolution) where you meant socks5h:// (resolution inside the tunnel).

Quick facts

What leaks: The hostname lookup, sent to your real DNS resolver outside the proxy
socks5 vs socks5h: socks5 resolves DNS client-side (leaks); socks5h resolves at the proxy (safe)
Why it matters: Reveals target hosts and a resolver geolocation that can mismatch the exit IP
Also affects: WebRTC and QUIC paths can do their own lookups outside the proxy
Fix: Force tunnel-side resolution, or run a controlled resolver bound to the proxy

How the leak happens

When a scraper requests https://example.com through a proxy, two things must travel through the tunnel: the TCP/TLS connection and the DNS lookup that turns example.com into an IP. With an HTTP proxy or a socks5h:// proxy, the client hands the hostname to the proxy and the proxy resolves it - nothing leaks. With a plain socks5:// proxy, many clients resolve the hostname locally first and then ask the proxy to connect to the resulting IP. That local lookup goes to your machine's configured DNS resolver, over your real connection.
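The difference is visible in the SOCKS5 CONNECT request itself. The sketch below builds the two request forms defined by RFC 1928: one carrying an IPv4 address (which the client can only send after resolving the name itself) and one carrying the hostname for the proxy to resolve. The function names are illustrative, not part of any library, and the sketch leaves out the handshake and authentication that precede the request.

```python
import ipaddress
import struct

SOCKS_VERSION = 0x05
CMD_CONNECT = 0x01
ATYP_IPV4 = 0x01      # address field holds 4 raw bytes
ATYP_DOMAIN = 0x03    # address field holds a length byte plus the hostname


def connect_request_by_ip(ipv4: str, port: int) -> bytes:
    """CONNECT request carrying an IPv4 address (the socks5:// pattern).

    The caller must already know the IP, which means the hostname was
    resolved before the proxy was ever contacted.
    """
    addr = ipaddress.IPv4Address(ipv4).packed
    header = bytes([SOCKS_VERSION, CMD_CONNECT, 0x00, ATYP_IPV4])
    return header + addr + struct.pack("!H", port)


def connect_request_by_name(hostname: str, port: int) -> bytes:
    """CONNECT request carrying the hostname (the socks5h:// pattern).

    No local lookup is needed; the proxy resolves the name.
    """
    name = hostname.encode("idna")
    if not 1 <= len(name) <= 255:
        raise ValueError("hostname must encode to 1-255 bytes")
    header = bytes([SOCKS_VERSION, CMD_CONNECT, 0x00, ATYP_DOMAIN, len(name)])
    return header + name + struct.pack("!H", port)
```

Only the second form lets the target's name reach the proxy unresolved, which is why the choice of scheme decides where the DNS query is issued.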

The same problem appears with system-level proxying that does not cover UDP, with split-tunnel VPN configs, and with libraries that have a separate DNS path from their HTTP path. The HTTP request looks perfectly proxied while the DNS query quietly exits the real interface.
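One way to catch a separate DNS path during testing is a tripwire around the resolver call. The following is a minimal sketch, assuming the scraper is pure Python and resolves through `socket.getaddrinfo`; `forbid_local_dns` and `DnsLeakError` are hypothetical names introduced here. It does not see lookups made through other entry points such as `socket.gethostbyname`, by C extensions, or by a browser subprocess.

```python
import contextlib
import ipaddress
import socket


class DnsLeakError(RuntimeError):
    """Raised when code under test tries to resolve a hostname locally."""


def _is_ip_literal(host: str) -> bool:
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


@contextlib.contextmanager
def forbid_local_dns(allowed_hosts=()):
    """Fail loudly on local lookups, except for allowed hosts (e.g. the proxy)."""
    allowed = {h.lower() for h in allowed_hosts}
    real_getaddrinfo = socket.getaddrinfo

    def guarded(host, *args, **kwargs):
        name = host.decode() if isinstance(host, bytes) else host
        # IP literals need no DNS query, so they pass straight through.
        if name is not None and not _is_ip_literal(name) and name.lower() not in allowed:
            raise DnsLeakError(f"local DNS lookup attempted for {name!r}")
        return real_getaddrinfo(host, *args, **kwargs)

    socket.getaddrinfo = guarded
    try:
        yield
    finally:
        # Always restore the real resolver, even if the body raised.
        socket.getaddrinfo = real_getaddrinfo
```

Wrapping a test request in `with forbid_local_dns(allowed_hosts=["proxy"]):` turns a silent local lookup of the target into an exception, under the assumptions above.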

Why a DNS leak deanonymizes a scraper

Two distinct harms. First, disclosure: whoever runs your DNS resolver (your ISP, a public resolver, or a corporate network) now has a log of every hostname you scraped, even though the page content went through the proxy. For anyone relying on the proxy for separation, that defeats the purpose.

Second, and more relevant to anti-bot detection, geo incoherence: the resolver that performed the lookup has its own IP address and therefore its own geolocation. A site that runs the authoritative nameserver for its domain sees which resolver asked for each hostname, and it can compare that resolver's location with the IP of the connection that follows. If your proxy exit is in Brazil but the lookup came from a German ISP's resolver, the mismatch is visible. This compounds the timezone/IP mismatch family of signals: the story your network tells stops being internally consistent.
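To make the correlation concrete, here is a hedged sketch of how an observer could pair lookups with connections: it hands each session a unique hostname under a zone it controls, then joins its authoritative DNS log with its web log on that token. Everything here is hypothetical (the function names, the log shapes, the `probe.example` zone), and mapping the resulting IPs to countries would need a GeoIP lookup that is not shown.

```python
import uuid


def make_probe_hostname(zone: str):
    """Return (token, hostname) for a one-off hostname under the observer's zone."""
    token = uuid.uuid4().hex
    return token, f"{token}.{zone}"


def correlate(dns_log, http_log):
    """Pair each HTTP client IP with the resolver IPs that looked up its token.

    dns_log:  iterable of (queried_name, resolver_ip) from the authoritative server
    http_log: iterable of (token, client_ip) from the web server
    """
    resolvers_by_token = {}
    for name, resolver_ip in dns_log:
        # The token is the left-most label; DNS names are case-insensitive.
        token = name.split(".", 1)[0].lower()
        resolvers_by_token.setdefault(token, set()).add(resolver_ip)
    return {
        client_ip: sorted(resolvers_by_token.get(token, ()))
        for token, client_ip in http_log
    }
```

If the client IP geolocates to the proxy exit while the paired resolver IP geolocates elsewhere, the observer has the mismatch described above.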

Closing the leak

The fixes, in order of preference:

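  1. Use socks5h:// not socks5:// - the h forces hostname resolution at the proxy. This one-character change fixes the most common leak in curl, Python requests, and other clients that follow curl's scheme convention. Clients differ, so check how yours handles SOCKS hostnames (httpx, for example, uses a different SOCKS backend than requests) and confirm with a leak test.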
  2. Use an HTTP/HTTPS proxy - HTTP proxies receive the hostname (in the CONNECT request for HTTPS targets, or in the request line for plain HTTP), so resolution happens proxy-side by design.
  3. Run a controlled local resolver bound to the tunnel, so even client-side lookups go through the proxy. Some anti-detect browsers ship a built-in resolver for exactly this.
  4. Contain UDP - QUIC/HTTP3 and WebRTC can perform their own out-of-band lookups; disable them or tunnel UDP (SOCKS5 UDP ASSOCIATE) so nothing escapes (see WebRTC leaks).
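The first fix in the list above can be enforced in code rather than left to whoever writes the config. A minimal sketch, assuming proxy settings arrive as URL strings and the client follows the socks5/socks5h convention; `force_remote_dns` is a hypothetical helper name, not a library function.

```python
from urllib.parse import urlsplit

# Scheme that resolves locally -> scheme that resolves at the proxy.
_REMOTE_DNS_SCHEME = {"socks5": "socks5h"}


def force_remote_dns(proxy_url: str) -> str:
    """Rewrite socks5:// to socks5h://; leave every other scheme untouched."""
    parts = urlsplit(proxy_url)
    scheme = _REMOTE_DNS_SCHEME.get(parts.scheme, parts.scheme)
    return parts._replace(scheme=scheme).geturl()
```

Passing every configured proxy URL through a helper like this before it reaches the HTTP client means a mistyped scheme cannot reintroduce the leak.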

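Verify with a DNS-leak test that reports which resolver performed the lookup. If the reported resolver belongs to your real ISP or sits in your real country, you are leaking. If it matches your proxy exit, the tunnel is clean. A proxy that forwards lookups to a public resolver may report a third location; that does not expose your network, but it can still look incoherent next to the exit IP.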
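The check can be automated once a leak test has reported resolver countries. The sketch below assumes you have already extracted two-letter country codes from whatever leak-test service you use; `classify_leak_test` and its three verdict strings are hypothetical. It returns "inconclusive" whenever country alone cannot settle the question, for example when your real country and the exit country are the same.

```python
def classify_leak_test(resolver_countries, exit_country, home_country):
    """Return 'leaking', 'clean', or 'inconclusive' from resolver country codes."""
    home = home_country.upper()
    exit_ = exit_country.upper()
    seen = {c.upper() for c in resolver_countries}

    # No data, or home and exit share a country: country cannot decide.
    if not seen or home == exit_:
        return "inconclusive"
    # Any resolver in the real country means a lookup escaped the tunnel.
    if home in seen:
        return "leaking"
    # Every resolver sits in the exit country.
    if seen == {exit_}:
        return "clean"
    # Resolvers in some third country (e.g. a public resolver used by the proxy).
    return "inconclusive"
```

For example, `classify_leak_test(["BR"], exit_country="BR", home_country="DE")` returns `"clean"`, while adding `"DE"` to the resolver list returns `"leaking"`.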

Code example

bash
# The one-character fix: socks5h instead of socks5

# LEAKS - curl resolves example.com locally, then proxies the IP
curl --proxy socks5://user:pass@proxy:1080 https://example.com

# SAFE - the 'h' forces resolution inside the proxy tunnel
curl --proxy socks5h://user:pass@proxy:1080 https://example.com

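# Python requests (installed with the requests[socks] extra) - same rule
#   proxies={'https': 'socks5h://user:pass@proxy:1080'}   # safe
#   proxies={'https': 'socks5://user:pass@proxy:1080'}    # leaks DNS
# httpx configures proxies differently; check its docs for your version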

# Verify which resolver answered (country should match the proxy exit).
# dnsleaktest.example is a placeholder; substitute a real leak-test endpoint.
curl --proxy socks5h://user:pass@proxy:1080 https://dnsleaktest.example/api

Frequently asked questions

What is the difference between socks5 and socks5h?

With socks5 the client resolves the hostname to an IP locally and asks the proxy to connect to that IP - the DNS query leaks to your real resolver. With socks5h the client sends the hostname to the proxy and the proxy resolves it, so nothing leaks. For scraping behind a proxy, almost always use socks5h.

Does an HTTP proxy leak DNS like socks5 does?

No. HTTP and HTTPS proxies receive the target hostname directly (in the request line or the CONNECT request), so the proxy performs the resolution by design. The leak is specific to SOCKS5 clients that resolve locally, plus side channels like WebRTC and QUIC.

Can a DNS leak get my scraper blocked, or just expose me?

Both. The immediate harm is disclosure - your resolver sees the hostnames. For anti-bot detection, the leak can create a geolocation mismatch between the DNS resolver and the proxy exit IP, which adds to the timezone/IP coherence signals that anti-bot systems already score.

Last updated: 2026-05-30