The shape of the bottleneck
Web scraping is almost always I/O-bound. A request takes anywhere from 200 ms to 30 s of wall time, of which only milliseconds are CPU work on your machine. Sync code spends the rest waiting; async code issues other requests during that wait. For 1,000 URLs at 1 second each, sync takes 1,000 seconds; async with 50 concurrent workers takes about 20 seconds (1,000 / 50). The arithmetic is unforgiving.
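The arithmetic can be checked without touching the network. The sketch below is illustrative only: fake_fetch, fetch_all, and timed_run are hypothetical names, and asyncio.sleep stands in for a real HTTP request, so it models waiting time and nothing else.

```python
import asyncio
import time


async def fake_fetch(url: str, delay: float) -> str:
    # Stand-in for a network request (assumption): sleeping models I/O wait.
    await asyncio.sleep(delay)
    return url


async def fetch_all(urls, concurrency: int, delay: float):
    # The semaphore caps how many simulated requests are in flight at once.
    sem = asyncio.Semaphore(concurrency)

    async def bounded(url):
        async with sem:
            return await fake_fetch(url, delay)

    return await asyncio.gather(*(bounded(u) for u in urls))


def timed_run(n_urls: int, concurrency: int, delay: float):
    # Returns (number of completed URLs, elapsed wall time in seconds).
    urls = [f"https://example.com/page/{i}" for i in range(n_urls)]
    start = time.perf_counter()
    results = asyncio.run(fetch_all(urls, concurrency, delay))
    return len(results), time.perf_counter() - start
```

Under this model, timed_run(1000, 50, 1.0) should finish in roughly 20 seconds, while the same call with a concurrency of 1 behaves like sync code and takes roughly 1,000 seconds.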
Where async stops helping
Concurrency is always bounded by something: your proxy pool, the target's per-IP rate limit, or the scraping API's per-account throughput. Once you hit any of those limits, adding more concurrency just queues requests. The right metric is throughput (URLs completed per minute), not concurrency. Measure it. If 50 workers give the same throughput as 200, the bottleneck has moved off your machine.
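One way to see the plateau is to simulate an external limit and measure throughput at several worker counts. This is a sketch under stated assumptions: UPSTREAM_SLOTS is a made-up cap standing in for a proxy pool or rate limit, measure_throughput is a hypothetical helper, and asyncio.sleep replaces the real request.

```python
import asyncio
import time

UPSTREAM_SLOTS = 20  # hypothetical external limit (e.g. proxy pool size)


async def limited_fetch(upstream: asyncio.Semaphore, delay: float) -> None:
    # Requests beyond the upstream limit wait here, however many workers exist.
    async with upstream:
        await asyncio.sleep(delay)  # stand-in for the real request


async def worker(queue: asyncio.Queue, upstream, delay: float) -> None:
    # Each worker pulls URLs until the queue is empty.
    while True:
        try:
            queue.get_nowait()
        except asyncio.QueueEmpty:
            return
        await limited_fetch(upstream, delay)


async def measure_throughput(n_urls: int, workers: int, delay: float = 0.02) -> float:
    # Returns completed URLs per minute for the given worker count.
    upstream = asyncio.Semaphore(UPSTREAM_SLOTS)
    queue: asyncio.Queue = asyncio.Queue()
    for i in range(n_urls):
        queue.put_nowait(i)
    start = time.perf_counter()
    await asyncio.gather(*(worker(queue, upstream, delay) for _ in range(workers)))
    elapsed = time.perf_counter() - start
    return n_urls / elapsed * 60
```

Running asyncio.run(measure_throughput(400, w)) for w in 10, 50, and 200 should show throughput rising from 10 to 50 workers and then staying roughly flat, because the simulated upstream cap of 20, not the worker count, sets the ceiling.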
Practical recommendations
For fewer than 100 URLs, write sync code: it is easier to debug, and one-off failures are easier to retry by hand. For 100-10,000 URLs, use async with a small concurrency cap (10-50). For more than 10,000 URLs, switch to a managed scraping API that handles concurrency, retries, and dead-letter queues for you. Building this layer yourself is more work than it looks.
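To give a sense of what that layer involves, here is a minimal sketch of a concurrency cap with retries and a dead-letter list. The names fetch_with_retries and scrape, the retry count, and the backoff values are all assumptions for illustration; fetch is any coroutine function you supply, and a production version would also need persistence, per-error handling, and rate limiting.

```python
import asyncio


async def fetch_with_retries(fetch, url, sem, max_attempts=3, base_delay=0.01):
    # Retry a failing URL with exponential backoff; re-raise after the last attempt.
    for attempt in range(1, max_attempts + 1):
        try:
            async with sem:  # the cap applies to each attempt, not to the backoff wait
                return await fetch(url)
        except Exception:
            if attempt == max_attempts:
                raise
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


async def scrape(fetch, urls, concurrency=10):
    # Returns (results by URL, dead-letter list of URLs that never succeeded).
    sem = asyncio.Semaphore(concurrency)
    outcomes = await asyncio.gather(
        *(fetch_with_retries(fetch, u, sem) for u in urls),
        return_exceptions=True,  # collect failures instead of aborting the batch
    )
    ok = {u: r for u, r in zip(urls, outcomes) if not isinstance(r, BaseException)}
    dead = [u for u, r in zip(urls, outcomes) if isinstance(r, BaseException)]
    return ok, dead
```

The dead-letter list is what you would inspect or re-run by hand after the batch finishes; in this sketch it lives only in memory.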
