Where Playwright fits in scraping
Playwright is the right tool when the data is rendered client-side and an HTTP client can't reach it: single-page apps that fetch via XHR after first paint, infinite-scroll lists, OAuth login flows, anything that requires real DOM events. It runs ~200MB of RAM per browser context — far heavier than curl_cffi — so use it only when the lighter approach doesn't work.
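The "lighter approach first" rule can be sketched as a small escalation helper. This is only an illustration under stated assumptions: `fetch_with_fallback`, `http_fetch`, `browser_fetch`, and `looks_complete` are hypothetical names, and the two fetchers are passed in as plain callables so the sketch does not depend on any particular HTTP client or browser library.

```python
from typing import Callable


def fetch_with_fallback(
    url: str,
    http_fetch: Callable[[str], str],
    browser_fetch: Callable[[str], str],
    looks_complete: Callable[[str], bool],
) -> tuple[str, str]:
    """Return (html, source), trying the cheap HTTP client before the browser.

    All four parameters are illustrative: http_fetch could wrap curl_cffi,
    browser_fetch could wrap Playwright, and looks_complete is a site-specific
    check that the data you need is present in the HTML.
    """
    html = http_fetch(url)
    if looks_complete(html):
        return html, "http"
    # The static response lacks the data, so pay for a real browser render.
    return browser_fetch(url), "browser"
```

In practice `looks_complete` is often as simple as checking for a marker that only appears once the list or table has been rendered.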
The Python API is the one most often used for scraping. async_playwright integrates cleanly with asyncio, and scrapy-playwright wraps it as a Scrapy download handler for crawls that need browser rendering only on specific pages. The Node.js version is the original and tends to get new features first, but the Python bindings are close enough to parity for scraping work.
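A rough sketch of how per-page rendering is usually switched on with scrapy-playwright, assuming its documented settings (the `DOWNLOAD_HANDLERS` entries and the asyncio reactor) and the `playwright` request meta key. `JS_PATHS` and `request_meta` are hypothetical names introduced here for illustration.

```python
from urllib.parse import urlparse

# settings.py fragment: route requests through scrapy-playwright's handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Hypothetical: path prefixes on the target site that need JS rendering.
JS_PATHS = ("/search", "/feed")


def request_meta(url: str) -> dict:
    """Build Request.meta so only JS-heavy pages are rendered in a browser."""
    path = urlparse(url).path
    # Requests without the "playwright" key go through the handler's
    # ordinary, non-browser download path.
    return {"playwright": True} if path.startswith(JS_PATHS) else {}
```

Inside a spider this would be used as `scrapy.Request(url, meta=request_meta(url))`, so the browser cost is paid only on the pages that need it.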
Why default Playwright gets blocked
Vanilla Playwright is detected on multiple surfaces simultaneously:
- navigator.webdriver === true: the most-checked flag, set by Playwright and Selenium alike.
- CDP connection signal: anti-bot scripts probe for window.cdc_ properties and Runtime.evaluate timing artifacts.
- Headless mode tells: missing chrome.runtime, missing plugins, languages array of length 1, no permissions API.
- Function.toString() inspection: any stealth plugin that patches methods at the JS level fails this check (see the toString inspection entry).
- The default Playwright User-Agent in headless mode includes "HeadlessChrome" unless explicitly overridden.
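To make the cheaper checks concrete, here is a minimal sketch of how a few of them could be collected and scored. It mirrors only the tells listed above and is not how any specific anti-bot vendor works. `PROBE_JS` and `automation_tells` are hypothetical names; the JavaScript string is meant to be passed to something like `page.evaluate(PROBE_JS)`, and the Python function scores the resulting dictionary.

```python
# JavaScript probe, intended to be run in the page (e.g. via page.evaluate).
PROBE_JS = """
() => ({
  webdriver: navigator.webdriver === true,
  userAgent: navigator.userAgent,
  hasChromeRuntime: !!(window.chrome && window.chrome.runtime),
  pluginCount: navigator.plugins.length,
  languageCount: (navigator.languages || []).length,
})
"""


def automation_tells(fp: dict) -> list[str]:
    """Return the names of the tells present in a fingerprint dict.

    fp is assumed to have the shape produced by PROBE_JS. Real detectors
    weigh many more signals; this only covers the cheap ones.
    """
    tells = []
    if fp.get("webdriver"):
        tells.append("navigator.webdriver")
    if "HeadlessChrome" in fp.get("userAgent", ""):
        tells.append("HeadlessChrome user agent")
    if not fp.get("hasChromeRuntime", False):
        tells.append("missing chrome.runtime")
    if fp.get("pluginCount", 0) == 0:
        tells.append("no plugins")
    if fp.get("languageCount", 0) <= 1:
        tells.append("short languages array")
    return tells
```

Running the probe against your own scraper session is a quick way to see which of these surfaces it still exposes. It says nothing about the CDP or toString checks, which need different probes.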
Setting headless: false (headless=False in Python) and overriding the User-Agent removes the cheap detections, but the CDP signal and toString inspection still fire. Production stealth requires a patched fork rather than runtime configuration.
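The runtime configuration described here looks roughly like this with the Python async API. It is a sketch only, and by the argument above it removes just the cheap detections. `DESKTOP_UA` is an assumed example string, `fetch_rendered_html` is a hypothetical helper name, and the `networkidle` wait is one possible choice for pages that load data after first paint.

```python
# Assumed example UA; use one that matches the browser build you actually run.
DESKTOP_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)

LAUNCH_KWARGS = {"headless": False}  # headed mode avoids headless-only tells
CONTEXT_KWARGS = {"user_agent": DESKTOP_UA, "locale": "en-US"}


async def fetch_rendered_html(url: str) -> str:
    """Render a page in headed Chromium with an overridden User-Agent."""
    # Imported lazily so the constants above stay importable without Playwright.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch(**LAUNCH_KWARGS)
        try:
            context = await browser.new_context(**CONTEXT_KWARGS)
            page = await context.new_page()
            # Wait for network activity to settle so XHR-loaded data is present.
            await page.goto(url, wait_until="networkidle")
            return await page.content()
        finally:
            await browser.close()
```

Call it with `asyncio.run(fetch_rendered_html(url))`. navigator.webdriver and the CDP signal are untouched by these options, which is why a patched build is still needed for hardened targets.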
Playwright vs Puppeteer vs Selenium
Picking between the three:
- Playwright: multi-browser, multi-language, modern auto-wait API. Default choice for new scrapers in Python or Node. Gentlest learning curve.
- Puppeteer: Node-only, Chromium-only. Smaller API surface, mature ecosystem, slightly faster startup. Pick it if you're Node-only and don't need Firefox/WebKit.
- Selenium: widest browser support (Safari, Edge, even mobile WebDriver), oldest API. Pick it if you need Safari testing or have an existing Selenium codebase. Most detectable of the three.
All three are easy to detect on a default install. Patched variants exist for Playwright (Camoufox, Patchright) and for Selenium (undetected-chromedriver, SeleniumBase UC), so the stealth ecosystem is the practical tiebreaker.
