Web Automation

Playwright vs Puppeteer

By the Scrappey Research Team

Playwright and Puppeteer are both browser automation libraries that drive a real browser programmatically, using the Chrome DevTools Protocol (CDP) for Chrome and Chromium, but they differ in scope: Playwright targets Chromium, Firefox, and WebKit from JavaScript/TypeScript, Python, .NET, and Java with built-in auto-waiting, while Puppeteer is JavaScript/TypeScript-only and centered on Chrome/Chromium. Puppeteer was released by Google's Chrome DevTools team in 2017; Playwright followed in 2020 from Microsoft, built by several of the original Puppeteer engineers who wanted cross-browser parity and a less flaky waiting model. They share a common heritage and a near-identical mental model (browser, context, page), so code reads similarly, but Playwright is the broader framework and Puppeteer is the leaner, Chrome-focused library.

Quick facts

First release / maintainer: Puppeteer 2017 (Google); Playwright 2020 (Microsoft)
Languages: Playwright: JS/TS, Python, .NET, Java; Puppeteer: JS/TS
Browsers: Playwright: Chromium, Firefox, WebKit; Puppeteer: Chrome/Chromium, Firefox (BiDi)
Waiting model: Playwright auto-waits on actionability; Puppeteer is mostly manual
Underlying protocol: Both use CDP for Chrome; Playwright patches Firefox/WebKit, Puppeteer uses WebDriver BiDi for Firefox

Shared roots, different scope

Both libraries automate a real browser. For Chrome and Chromium they speak the Chrome DevTools Protocol, a low-level command-and-event channel that lets you navigate, evaluate JavaScript, intercept network traffic, and read the DOM. Puppeteer shipped first in 2017 from Google's Chrome DevTools team and stayed deliberately focused: one language (JavaScript/TypeScript) driving Chrome and Chromium. Playwright arrived in 2020 from Microsoft, authored partly by ex-Puppeteer engineers, with a wider remit.

The practical differences in scope:

  • Language bindings. Playwright ships official, feature-matched clients for Node.js, Python, .NET (C#), and Java. Puppeteer is Node-only; community Python ports exist (for example Pyppeteer) but lag the upstream library.
  • Browser coverage. Playwright drives Chromium, Firefox, and WebKit (the engine behind Safari) from the same API, using Microsoft-maintained browser builds so behavior is consistent. Puppeteer is Chrome/Chromium-first and added Firefox support through the cross-browser WebDriver BiDi standard (stable from Puppeteer 23, the default Firefox protocol from 24). There is no WebKit/Safari engine in Puppeteer.
  • Tooling. Playwright bundles a test runner (@playwright/test), trace viewer, codegen, and assertions. Puppeteer is a pure automation library and leaves test orchestration to Jest, Mocha, or your own scripts.

Because the object model (browser to context to page) is so similar, porting a script in either direction is usually mechanical rather than a rewrite.
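As a rough illustration of the shared API across engines, the sketch below runs one routine against several browser types. It assumes the `playwright` package and its browser builds are installed; `titleAcrossEngines` is a hypothetical helper name, not part of either library.

```javascript
// Hypothetical helper: run one routine against several browser engines.
// `engines` maps a label to a Playwright BrowserType (chromium, firefox, webkit).
async function titleAcrossEngines(engines, url) {
  const results = {};
  for (const [name, browserType] of Object.entries(engines)) {
    const browser = await browserType.launch();
    try {
      const page = await browser.newPage();
      await page.goto(url);
      results[name] = await page.title();
    } finally {
      await browser.close(); // always release the browser, even on failure
    }
  }
  return results;
}
```

With Playwright installed, a call might look like `titleAcrossEngines({ chromium, firefox, webkit }, 'https://example.com')`, where the three browser types come from the `playwright` module. Taking the engines as a parameter keeps the routine itself free of any engine-specific code.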

Auto-waiting, contexts, and parallelism

The biggest day-to-day difference is how each handles timing. Playwright auto-waits: before clicking, filling, or asserting, it checks that the element is attached, visible, stable, and enabled, retrying until the checks pass or the action's timeout expires. That removes most hand-written waitForSelector and arbitrary setTimeout calls and is the main reason Playwright scripts tend to be less flaky on dynamic, JavaScript-heavy pages.
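The retry-until-actionable idea can be pictured as a polling loop. The following is a simplified illustration only, not Playwright's internal implementation; `waitUntilActionable`, its option names, and the default values are all assumptions made for this sketch.

```javascript
// Simplified illustration of "retry until actionable"; NOT Playwright's internal code.
// `checks` is an array of (possibly async) predicates, e.g. isVisible, isEnabled.
async function waitUntilActionable(checks, { timeoutMs = 5000, intervalMs = 100 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const results = await Promise.all(checks.map((check) => check()));
    if (results.every(Boolean)) return; // every condition holds: safe to act
    if (Date.now() >= deadline) {
      throw new Error(`Element not actionable within ${timeoutMs} ms`);
    }
    // Pause briefly, then re-run all checks.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

The point of the sketch is that the caller states conditions once and the loop handles the timing, which is the work Playwright does for you before each action.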

Puppeteer gives you more explicit control but expects you to do the waiting. You typically call page.waitForSelector(), page.waitForNavigation(), or page.waitForFunction() yourself. That is more verbose, though some teams prefer the predictability of stating every wait condition.
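A typical explicit-wait sequence in Puppeteer might look like the sketch below. The helper name `clickAndWaitForContent` and its selector parameters are hypothetical; the `page` methods used are standard Puppeteer calls.

```javascript
// Hypothetical helper for Puppeteer: click a link, wait for the next page, read text.
async function clickAndWaitForContent(page, linkSelector, contentSelector) {
  await page.waitForSelector(linkSelector, { visible: true });
  // Start listening for the navigation before clicking to avoid a race.
  await Promise.all([
    page.waitForNavigation({ waitUntil: 'domcontentloaded' }),
    page.click(linkSelector),
  ]);
  await page.waitForSelector(contentSelector);
  // $eval runs the callback in the page against the first matching element.
  return page.$eval(contentSelector, (el) => el.textContent.trim());
}
```

Every wait condition is spelled out, which is more code than the Playwright equivalent but makes the script's timing assumptions visible.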

Both expose lightweight BrowserContext objects: isolated sessions inside one browser process, each with its own cookies, local storage, and cache. This is how you run many independent sessions cheaply (different logins, different proxies per context) without booting a separate browser per worker. Playwright leans into this with its parallel test runner and per-context tracing; Puppeteer offers the equivalent primitive through browser.createBrowserContext() (Playwright's is browser.newContext()), but you wire up concurrency yourself. For large scraping runs, this context-per-job pattern is what keeps memory and startup cost manageable in both libraries.
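The context-per-job pattern can be sketched as a small worker pool. This is one possible arrangement under stated assumptions: `runJobsInContexts` and the `concurrency` parameter are hypothetical, each job is assumed to be an async function that receives a page, and the code uses Playwright's `browser.newContext()` (with Puppeteer you would call `browser.createBrowserContext()` instead).

```javascript
// Hypothetical helper: run each job in its own isolated browser context,
// with at most `concurrency` contexts open at once.
async function runJobsInContexts(browser, jobs, concurrency = 3) {
  const results = new Array(jobs.length);
  let next = 0;

  async function worker() {
    while (next < jobs.length) {
      const index = next++; // claim the next job (safe: JS runs this synchronously)
      const context = await browser.newContext(); // fresh cookies, storage, cache
      try {
        const page = await context.newPage();
        results[index] = await jobs[index](page);
      } finally {
        await context.close(); // free the session's resources
      }
    }
  }

  // Start a fixed number of workers that pull jobs until none remain.
  const workers = Array.from({ length: Math.min(concurrency, jobs.length) }, () => worker());
  await Promise.all(workers);
  return results;
}
```

One browser process serves all jobs, and the pool size caps how many contexts exist at a time, which is what keeps memory bounded on long runs.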

Which to choose for scraping vs testing

Pick based on the job rather than on popularity: each library has clear strengths.

  • Choose Playwright when you need cross-browser coverage (WebKit/Safari rendering matters), you work in Python/.NET/Java, you want auto-waiting to tame flaky dynamic pages, or you want a batteries-included test framework with tracing and codegen. For new scraping projects it is often the stronger default because multi-engine support and resilient waiting reduce maintenance.
  • Choose Puppeteer when you are Node-only and Chrome-only, want a smaller dependency surface, or need direct, fine-grained CDP access (it sits closer to the raw protocol and is a common base for AI-agent browser tooling). For simple single-browser Chrome scripts it is lean and fast to start.
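For a sense of what direct CDP access looks like, here is a minimal sketch that reads Chromium performance metrics over a raw protocol session. It assumes a Chromium-based browser and recent library versions; `getCdpMetrics` and its `library` argument are hypothetical, while `Performance.enable` and `Performance.getMetrics` are CDP methods.

```javascript
// Hypothetical helper: read Chromium performance metrics over a raw CDP session.
// Works only with Chromium-based browsers, in either library.
async function getCdpMetrics(page, library) {
  const client = library === 'puppeteer'
    ? await page.createCDPSession()              // Puppeteer
    : await page.context().newCDPSession(page);  // Playwright
  await client.send('Performance.enable');
  const { metrics } = await client.send('Performance.getMetrics');
  await client.detach();
  // Convert [{ name, value }, ...] into a plain { name: value } object.
  return Object.fromEntries(metrics.map((m) => [m.name, m.value]));
}
```

Both libraries expose a session object with `send()`, so the protocol calls are identical once the session exists; only the way you obtain it differs.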

Both are excellent at controlling a browser, but neither solves the operational side of large-scale web scraping: rotating residential proxies, rendering JavaScript at scale, managing realistic browser fingerprints, and retrying transient failures. You either build that infrastructure yourself or front your automation with it. A managed web-data API such as Scrappey handles proxies, a real headless browser, fingerprinting, and retries behind a single HTTP request, so you can keep Playwright or Puppeteer for local logic and offload the heavy lifting when a target needs it.
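If you do build the retry layer yourself, one common shape is a wrapper with exponential backoff. The sketch below is library-agnostic and illustrative only; `withRetries`, its option names, and the default values are assumptions, and it retries on any error rather than distinguishing transient from permanent failures.

```javascript
// Hypothetical helper: retry an async task with exponential backoff.
async function withRetries(task, { attempts = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        // Wait baseDelayMs, then 2x, then 4x, ... before the next attempt.
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError; // every attempt failed: surface the last error
}
```

A scraping job would pass its page logic as `task`, for example a function that opens a fresh context, loads the page, and returns the extracted data.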

Code example

javascript
// Same task in both libraries: open a page, wait for content, scrape titles.

// --- Playwright (auto-waits; cross-browser by swapping the import) ---
import { chromium } from 'playwright'; // or: firefox, webkit

const browser = await chromium.launch({ headless: true });
const context = await browser.newContext({ locale: 'en-US' }); // isolated session
const page = await context.newPage();

await page.goto('https://example.com/products', { waitUntil: 'domcontentloaded' });
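// Locator actions such as click() auto-wait, but allTextContents() returns
// immediately, so wait for the first match before reading the whole list.
await page.locator('.product .title').first().waitFor();
const titles = await page.locator('.product .title').allTextContents();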
console.log(titles);

await browser.close();

// --- Puppeteer (Node-only; you wait explicitly) ---
import puppeteer from 'puppeteer';

const pBrowser = await puppeteer.launch({ headless: true });
const pPage = await pBrowser.newPage();

await pPage.goto('https://example.com/products', { waitUntil: 'domcontentloaded' });
await pPage.waitForSelector('.product .title'); // manual wait
// $$eval passes every matching element as an array ($eval passes only the first).
const pTitles = await pPage.$$eval('.product .title', els => els.map(e => e.textContent.trim()));
console.log(pTitles);

await pBrowser.close();

Frequently asked questions

Do Playwright and Puppeteer use the same protocol under the hood?

Both drive Chrome and Chromium over the Chrome DevTools Protocol (CDP), which is why their APIs feel so similar. They diverge for other engines: Playwright ships its own patched Firefox and WebKit builds to keep CDP-style control consistent, while Puppeteer added Firefox support through the cross-browser WebDriver BiDi standard and does not support the WebKit/Safari engine at all.

Is Playwright just a fork of Puppeteer?

No, it is a separate codebase, but the lineage is real. Playwright was created at Microsoft in 2020 by several engineers who had previously worked on Puppeteer at Google, so it carries forward many of the same design ideas while adding multi-browser support, multiple language bindings, and built-in auto-waiting. The two are independent projects that happen to share a heritage and a similar object model.

Can I use Python with Puppeteer like I can with Playwright?

Not officially. Playwright ships a maintained Python client with the same features as its JavaScript version, alongside .NET and Java. Puppeteer is JavaScript and TypeScript only; community Python ports such as Pyppeteer exist but tend to trail the upstream library in features and updates, so for production Python work Playwright is usually the safer choice.

Which is better for web scraping in 2026?

For new scraping projects, Playwright is often the stronger default because of multi-browser support, native auto-waiting that reduces flakiness on dynamic pages, and first-class Python and .NET clients. Puppeteer remains a great fit when you are Node-only and Chrome-only or want a smaller, lower-level dependency. Neither handles proxies, fingerprinting, or large-scale retries on its own, so demanding targets often pair either library with a managed web-data API.

Last updated: 2026-06-16 · Facts last verified: 2026-06-16