The shape of the bottleneck
Web scraping is almost always I/O-bound: the time goes to waiting on the network, not to computation. A single request takes anywhere from 200 ms to 30 s of wall-clock time, but the CPU work on your machine amounts to a few milliseconds. Synchronous code sits idle for the rest of that time; asynchronous code starts another request during the wait. The math is stark: for 1,000 URLs at 1 second each, sync takes 1,000 seconds, while async with 50 concurrent workers (50 requests in flight at once) finishes in about 20 seconds (1,000 / 50), provided nothing else limits you.
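To make the arithmetic concrete, here is a minimal sketch using Python's asyncio. It makes no real network calls: fake_fetch is a hypothetical stand-in that sleeps to imitate network wait, and the example.com URLs and the ideal_seconds helper are illustrative names, not part of any library. The semaphore is what holds the number of in-flight requests at the chosen cap.

```python
import asyncio
import math
import time


def ideal_seconds(n_urls: int, seconds_each: float, concurrency: int = 1) -> float:
    """Best-case wall-clock time: requests complete in waves of `concurrency`."""
    return math.ceil(n_urls / concurrency) * seconds_each


async def fake_fetch(url: str, delay: float) -> str:
    # Hypothetical stand-in for an HTTP request: the sleep models network wait.
    await asyncio.sleep(delay)
    return url


async def fetch_all(urls, concurrency: int, delay: float):
    # The semaphore caps how many requests are in flight at once.
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded(url):
        async with semaphore:
            return await fake_fetch(url, delay)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(u) for u in urls))


if __name__ == "__main__":
    print(ideal_seconds(1000, 1.0))      # synchronous: one request at a time
    print(ideal_seconds(1000, 1.0, 50))  # 50 requests in flight

    urls = [f"https://example.com/page/{i}" for i in range(100)]
    start = time.perf_counter()
    pages = asyncio.run(fetch_all(urls, concurrency=50, delay=0.02))
    print(len(pages), "pages in", round(time.perf_counter() - start, 3), "seconds")
```

In a real scraper the sleep would be replaced by a call through an async HTTP client; the semaphore pattern stays the same.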
Where async stops helping
Concurrency always runs into a ceiling somewhere: your proxy pool, the target site's per-IP rate limit, or your scraping API's per-account throughput cap. Once you hit any of these, adding more concurrency just makes requests pile up in a queue without finishing any faster. The number to watch is throughput (URLs actually completed per minute), not how many requests you launch at once. Measure it directly. If 50 workers produce the same throughput as 200, the bottleneck has moved off your machine and onto one of those external limits, and the extra 150 workers buy you nothing.
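One way to measure this is to run the same batch at several concurrency levels and compare completed URLs per minute. The sketch below is a simulation under stated assumptions, not a benchmark of any real site: make_capped_fetch is a hypothetical stand-in for a target that serves only external_cap requests at a time, and all function names and the 10% tolerance are illustrative choices.

```python
import asyncio
import time


async def measure_throughput(fetch, urls, concurrency: int) -> float:
    """Return completed URLs per minute for one concurrency setting."""
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded(url):
        async with semaphore:
            await fetch(url)

    start = time.perf_counter()
    await asyncio.gather(*(bounded(u) for u in urls))
    elapsed = time.perf_counter() - start
    return len(urls) / elapsed * 60


def smallest_saturating_concurrency(results: dict, tolerance: float = 0.10) -> int:
    """Lowest concurrency whose throughput is within `tolerance` of the best seen."""
    best = max(results.values())
    return min(c for c, t in results.items() if t >= best * (1 - tolerance))


def make_capped_fetch(external_cap: int, delay: float):
    # Hypothetical external limit: only `external_cap` requests are served at once,
    # no matter how many the client launches.
    gate = asyncio.Semaphore(external_cap)

    async def fetch(url):
        async with gate:
            await asyncio.sleep(delay)  # simulated network wait

    return fetch


async def sweep(levels, n_urls: int = 100, external_cap: int = 10, delay: float = 0.02):
    fetch = make_capped_fetch(external_cap, delay)
    urls = [f"https://example.com/item/{i}" for i in range(n_urls)]
    results = {}
    for level in levels:
        results[level] = await measure_throughput(fetch, urls, level)
    return results


if __name__ == "__main__":
    results = asyncio.run(sweep([5, 20, 40]))
    for level, per_minute in results.items():
        print(f"concurrency={level:>3}  throughput={per_minute:,.0f} URLs/min")
    print("saturates at about", smallest_saturating_concurrency(results))
```

With the simulated cap of 10, throughput should roughly double from 5 to 20 workers and then stay flat from 20 to 40, which is the plateau described above.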
Practical recommendations
For under 100 URLs, just write synchronous code: it is easier to debug, and the odd failure is easy to retry by hand. For 100 to 10,000 URLs, use async with a modest concurrency cap (10-50). Above 10,000 URLs, switch to a managed scraping API that handles concurrency, retries, and dead-letter queues (a holding spot for requests that keep failing) for you. Building that layer yourself is more work than it looks.
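For the middle tier, a minimal sketch of what a capped async scraper with retries and a dead-letter list might look like is shown here. It is illustrative only: the fetch argument is any coroutine function you supply, and the names scrape, fetch_with_retries, max_attempts, and base_delay are assumptions made for this example rather than any library's API.

```python
import asyncio


async def fetch_with_retries(fetch, url, max_attempts: int = 3, base_delay: float = 0.5):
    """Call `fetch(url)`, retrying with exponential backoff; re-raise on the last failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await fetch(url)
        except Exception:
            if attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


async def scrape(fetch, urls, concurrency: int = 10, max_attempts: int = 3,
                 base_delay: float = 0.5):
    """Return (results, dead_letters): successes by URL, and URLs that kept failing."""
    semaphore = asyncio.Semaphore(concurrency)
    results = {}
    dead_letters = []  # holding spot for requests that exhausted their retries

    async def worker(url):
        async with semaphore:
            try:
                results[url] = await fetch_with_retries(fetch, url, max_attempts, base_delay)
            except Exception as exc:
                dead_letters.append((url, repr(exc)))

    await asyncio.gather(*(worker(u) for u in urls))
    return results, dead_letters
```

Even this small version leaves out persistence of the dead-letter list across runs, per-host rate limiting, and telling retryable errors apart from permanent ones, which gives a sense of how quickly the do-it-yourself layer grows.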
