How datacenter IPs are detected
Three checks fire before the page even loads:
- ASN lookup at the CDN edge. An ASN is the ID for a block of IPs owned by one operator; the CDN edge is the server that handles your request first, before it reaches the real site. Major anti-bot vendors keep comprehensive lists of hosting-provider ASNs (AS16509 AWS, AS15169 GCP, AS8075 Azure, AS14061 DigitalOcean, plus OVH, Hetzner, Linode, Vultr, etc.). AWS WAF specifically maintains the HostingProviderIPList with ASN-based inclusion. Match → blocked.
- Published cloud subnets. A subnet is just a chunk of IP addresses. AWS, GCP, and Azure publish their public IP ranges as official JSON feeds. Anti-bot vendors pull these in directly and update their blocklists in near real time.
- Reverse-DNS pattern matching. Reverse DNS turns an IP back into a hostname. Many datacenter IPs answer with names like ec2-54-83-... or compute-1.amazonaws.com. Even if a request clears the ASN check, that hostname still gives the IP away.
One widely cited figure: roughly 99% of traffic from known datacenter ranges is bot traffic. So a site can block on ASN knowing that the false-positive rate (a real user who happens to come from AWS) is very low.
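The three checks above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's actual implementation. The ASN table holds only the four ASNs named above, SAMPLE_FEED copies the prefixes / ip_prefix layout of the AWS feed but uses placeholder documentation ranges, and the hostname pattern is a hypothetical example. The ASN and reverse-DNS name are passed in by the caller; a live version might get the name from socket.gethostbyaddr and the ASN from a routing or GeoIP database.

```python
import ipaddress
import re

# Hosting-provider ASNs named in the text (a small subset, not a full list).
HOSTING_ASNS = {16509: "AWS", 15169: "GCP", 8075: "Azure", 14061: "DigitalOcean"}

# Shaped like the AWS ip-ranges.json feed. The prefixes are RFC 5737
# documentation ranges used as placeholders, not real cloud subnets.
SAMPLE_FEED = {
    "prefixes": [
        {"ip_prefix": "192.0.2.0/24", "service": "EC2"},
        {"ip_prefix": "198.51.100.0/24", "service": "EC2"},
    ]
}

# Hypothetical pattern for datacenter-style reverse-DNS names.
DATACENTER_PTR = re.compile(
    r"^ec2-\d+-\d+-\d+-\d+\.|\.amazonaws\.com\.?$", re.IGNORECASE
)


def load_networks(feed):
    """Turn the feed's CIDR strings into network objects for membership tests."""
    return [ipaddress.ip_network(entry["ip_prefix"]) for entry in feed["prefixes"]]


def classify(ip, asn=None, ptr_name=None, networks=()):
    """Return the list of checks that flag this request as datacenter traffic."""
    reasons = []

    # Check 1: the IP's ASN belongs to a known hosting provider.
    if asn in HOSTING_ASNS:
        reasons.append(f"asn:{HOSTING_ASNS[asn]}")

    # Check 2: the IP falls inside a published cloud subnet.
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in networks):
        reasons.append("published-subnet")

    # Check 3: the reverse-DNS hostname looks like a cloud instance name.
    if ptr_name and DATACENTER_PTR.search(ptr_name):
        reasons.append("reverse-dns")

    return reasons


if __name__ == "__main__":
    nets = load_networks(SAMPLE_FEED)
    print(classify("192.0.2.10", asn=16509,
                   ptr_name="ec2-192-0-2-10.compute-1.amazonaws.com",
                   networks=nets))
```

An empty list means none of the three checks matched. Any non-empty result would be enough for a site following the "Match → blocked" rule to reject the request.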
When datacenter proxies are actually fine
Datacenter is the right tool when:
- The target has no anti-bot at all. Public APIs, government open data, academic sites, and large static-content sites that simply do not care about scraping.
- The target is your own infrastructure. Checking your own production endpoints from another region, geofence testing, or load testing — datacenter works because your own systems do not block it.
- You can authenticate. Once you hold an API token, the ASN check usually falls away because authenticated requests are trusted differently. Datacenter is fine for authenticated API integrations.
- You are testing. Burning cheap IPs to size up a target before you commit to a residential budget.
Anywhere else, behind any real anti-bot system, datacenter gets blocked almost instantly, no matter how perfect your TLS fingerprint is. (TLS is the encryption layer behind https, and its handshake leaves a recognisable signature.)
The proxy ladder by trust
Proxy types form a ladder from cheapest/least trusted up to most trusted/most expensive:
- Datacenter — ~$0.50–$1.50/GB. Unprotected targets only.
- ISP / static residential — ~$1.50–5/IP/month. Datacenter hardware that is announced to the internet under residential ASNs, so it looks like a home connection. Multi-request trust scoring rewards them.
- Residential — ~$3–10/GB. Peer-to-peer networks of real consumer devices. The default choice for general anti-bot work.
- Mobile / 4G–5G — ~$10–15/GB. Real carrier IPs sitting behind carrier-grade NAT (many phones share one IP, so blocking it would hit innocent users), which earns the highest trust score. For the hardest anti-bot targets.
Rule of thumb: match proxy cost to target difficulty. Spending mobile-tier money on an unprotected academic dataset is waste; using datacenter on a protected retail site is also waste — every request fails.
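The rule of thumb can be written down as a small lookup. The sketch below encodes the ladder with the approximate prices listed above and picks a tier from a protection level. The 0 to 3 protection scale, its mapping onto tiers, and the names Tier, pick_tier, and estimate_cost are assumptions made for illustration; real targets rarely sort this neatly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    low: float   # lower bound of the approximate price, in USD
    high: float  # upper bound of the approximate price, in USD
    unit: str    # billing unit: "GB" or "IP/month"


# The ladder from the text, ordered from least to most trusted.
LADDER = [
    Tier("datacenter", 0.50, 1.50, "GB"),
    Tier("isp", 1.50, 5.00, "IP/month"),
    Tier("residential", 3.00, 10.00, "GB"),
    Tier("mobile", 10.00, 15.00, "GB"),
]


def pick_tier(protection, authenticated=False, own_infrastructure=False):
    """Pick the cheapest tier likely to work.

    protection is a hypothetical scale: 0 = no anti-bot, 3 = hardest targets.
    Authenticated API access and your own infrastructure fall back to
    datacenter, following the cases listed earlier in the text.
    """
    if authenticated or own_infrastructure:
        return LADDER[0]
    level = min(max(protection, 0), len(LADDER) - 1)  # clamp to a valid index
    return LADDER[level]


def estimate_cost(tier, units):
    """Return the (low, high) cost range for the given number of billing units."""
    return (tier.low * units, tier.high * units)


if __name__ == "__main__":
    tier = pick_tier(protection=2)
    low, high = estimate_cost(tier, units=10)
    print(f"{tier.name}: ${low:.2f} to ${high:.2f} for 10 {tier.unit}")
```

Because the ISP tier is billed per IP per month while the others are billed per GB, the cost range is only comparable between tiers that share a unit.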
