What Is the Chrome DevTools Protocol (CDP) in Web Scraping?

By the Scrappey Research Team

The Chrome DevTools Protocol (CDP) is the low-level interface for instrumenting and controlling Chromium-based browsers. Low-level means it speaks directly to the browser's internals rather than through a convenience layer, so it is powerful but wordy. It is the same machinery your browser's built-in DevTools panel (F12) uses to inspect a page. Puppeteer, Playwright, and many stealth tools sit on top of CDP. For scraping it gives you fine-grained control: intercept network requests, override headers, run JavaScript inside the page, capture screenshots, and dump the DOM (the live structure of the page). Using CDP directly is verbose, but it reaches capabilities the higher-level libraries do not expose.
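To make "low-level" concrete, here is a minimal sketch of the messages CDP exchanges: JSON objects sent over the WebSocket, where a command carries an id, a method, and params, a reply echoes the id, and an event has no id. The helper names cdp_command and classify are hypothetical, chosen for illustration; only the message shape comes from the protocol.

```python
import itertools
import json

# Each command needs a unique id so its reply can be matched later.
_ids = itertools.count(1)

def cdp_command(method, params=None):
    """Serialize one CDP command as the JSON text sent over the WebSocket."""
    return json.dumps({"id": next(_ids), "method": method, "params": params or {}})

def classify(raw):
    """Label an incoming message as 'reply', 'error', or 'event'."""
    msg = json.loads(raw)
    if "id" in msg:
        # Replies echo the command id and carry either a result or an error.
        return "error" if "error" in msg else "reply"
    # Messages without an id are events pushed by the browser.
    return "event"

print(cdp_command("Page.navigate", {"url": "https://example.com"}))
print(classify('{"method": "Page.loadEventFired", "params": {}}'))
```

Wrappers such as Puppeteer and Playwright hide this bookkeeping; with raw CDP you handle the ids and the event stream yourself.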

Quick facts

What it controls: Any Chromium-based browser (Chrome, Edge, Brave, Opera)
Connection: WebSocket endpoint exposed by --remote-debugging-port (listed at http://localhost:<port>/json)
Built on top of it: Puppeteer, Playwright (for Chromium), undetected-chromedriver
Direct use cases: Custom interception, attach-to-existing-Chrome, browser-internal probes
Detectability: Launching with --remote-debugging-port exposes CDP; some sites look for signs of it

Where CDP fits

Every library that controls a Chromium browser ultimately speaks CDP under the hood. Puppeteer and Playwright wrap it in friendlier APIs and add their own conveniences, such as auto-waiting (pausing until an element is ready) and selector engines (helpers for finding elements on the page). For the vast majority of scraping work you want the wrapper, not raw CDP. The main exception is when you need to attach to a real user's Chrome - a profile that already has cookies, history, and extensions installed - instead of launching a fresh headless instance (a browser with no visible window). In that case, talking to CDP directly through its WebSocket endpoint (the live two-way connection the browser opens for debugging) is the cleanest path.
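When attaching to an already-running Chrome, the first step is choosing which tab to talk to. The sketch below assumes the target list has already been fetched from the debugging port's /json endpoint and picks the first real tab, optionally filtered by URL. The function name pick_page_target and the sample entries are made up for illustration.

```python
def pick_page_target(targets, url_contains=None):
    """Return the WebSocket URL of the first matching tab, or None."""
    for target in targets:
        if target.get("type") != "page":
            continue  # skip service workers, extensions, and other non-tab targets
        if url_contains and url_contains not in target.get("url", ""):
            continue  # a tab, but not the one we are looking for
        return target.get("webSocketDebuggerUrl")
    return None

# Sample shaped like a /json response; all values here are invented.
sample = [
    {"type": "service_worker", "url": "https://example.com/sw.js",
     "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/AAA"},
    {"type": "page", "url": "https://example.com/account",
     "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/BBB"},
]

print(pick_page_target(sample, "example.com"))
```

Filtering on type matters because a real profile often lists extension and worker targets ahead of the tab you care about.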

Detection considerations

Chrome only opens the CDP port when you launch it with the --remote-debugging-port flag. Some defensive scripts probe for this and flag the session - but it is a weak signal, because the port is visible only to the host machine, not to the page itself. The stronger CDP-related giveaway is the Runtime domain being enabled in the page context, which Puppeteer and Playwright do by default by sending Runtime.enable. (A domain is one feature area of CDP; the Runtime.enable command turns on JavaScript-execution hooks for the Runtime domain, and turning them on leaves traces a page can notice.) Some automation tools leave such domains disabled when they are not needed.
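One way to reason about this exposure is to audit which domains a session has left switched on. The following is a rough sketch, assuming you keep a log of the CDP method names your tool has sent; the function name active_domains and the sample log are hypothetical.

```python
def active_domains(sent_methods):
    """Replay a log of sent CDP method names and return the domains left enabled."""
    active = set()
    for method in sent_methods:
        # Method names have the form "Domain.command", e.g. "Runtime.enable".
        domain, _, command = method.partition(".")
        if command == "enable":
            active.add(domain)
        elif command == "disable":
            active.discard(domain)
    return active

log = ["Page.enable", "Runtime.enable", "Page.navigate", "Runtime.disable"]
print(sorted(active_domains(log)))  # ['Page']
```

If Runtime shows up in the result while a page's scripts are running, the session carries the signal described above.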

When to use CDP directly

Three real cases call for raw CDP: (1) attaching to an existing Chrome process that uses a real profile, (2) building custom request interception that Playwright's API does not expose, and (3) building a tool that needs precise control over which CDP domains are enabled. For everything else, Playwright or Puppeteer is the better default.
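As an illustration of case (2), raw interception goes through CDP's Fetch domain: you send Fetch.enable, the browser emits a Fetch.requestPaused event for each matching request, and you answer each one with Fetch.continueRequest or Fetch.failRequest. The sketch below only builds those messages; sending them over the WebSocket is left out. The blocking policy and the helper names fetch_enable and handle_paused are assumptions for illustration, and this particular policy is simple enough that a wrapper could express it too.

```python
# Hypothetical policy: drop heavy resource types, pass everything else through.
BLOCKED_TYPES = {"Image", "Font", "Media"}

def fetch_enable(msg_id):
    """Command that turns on interception for every URL."""
    return {"id": msg_id, "method": "Fetch.enable",
            "params": {"patterns": [{"urlPattern": "*"}]}}

def handle_paused(event, msg_id, extra_headers=None):
    """Map one Fetch.requestPaused event to the command that resolves it."""
    params = event["params"]
    request_id = params["requestId"]
    if params.get("resourceType") in BLOCKED_TYPES:
        return {"id": msg_id, "method": "Fetch.failRequest",
                "params": {"requestId": request_id, "errorReason": "BlockedByClient"}}
    # The event carries headers as an object; the command expects name/value pairs.
    headers = dict(params["request"].get("headers", {}))
    headers.update(extra_headers or {})
    return {"id": msg_id, "method": "Fetch.continueRequest",
            "params": {"requestId": request_id,
                       "headers": [{"name": k, "value": v} for k, v in headers.items()]}}

paused = {"method": "Fetch.requestPaused",
          "params": {"requestId": "r1", "resourceType": "Image",
                     "request": {"url": "https://example.com/a.png", "headers": {}}}}
print(handle_paused(paused, 2)["method"])  # Fetch.failRequest
```

Every paused request must be resolved one way or the other, otherwise the page stalls waiting for it.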

Code example

python
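import asyncio, json
import requests    # third-party: pip install requests
import websockets  # third-party: pip install websockets

async def cdp_navigate(url):
    # Chrome must already be running with --remote-debugging-port=9222
    targets = requests.get('http://localhost:9222/json').json()
    # Pick the first real tab; the list can also hold workers and extensions
    page = next(t for t in targets if t.get('type') == 'page')
    async with websockets.connect(page['webSocketDebuggerUrl']) as ws:
        await ws.send(json.dumps({
            'id': 1, 'method': 'Page.navigate', 'params': {'url': url}
        }))
        # Read messages until the reply whose id matches our command arrives
        while True:
            reply = json.loads(await ws.recv())
            if reply.get('id') == 1:
                return reply.get('result')

print(asyncio.run(cdp_navigate('https://example.com')))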

Frequently asked questions

Should I use CDP directly or Playwright?

Use Playwright unless you have a specific reason not to. Direct CDP is verbose, undocumented for many edge cases, and tends to break between Chrome versions. Playwright keeps that compatibility working for you.

Can sites detect CDP usage?

They cannot see the protocol itself, but they can detect its symptoms: side effects of Runtime.enable, a missing chrome.runtime.runtimeId value, and certain navigator probes (checks scripts run against the browser's navigator object). Well-configured automation tools account for most of those signals.

Does CDP work in Firefox?

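Not in any practical sense. Firefox shipped an experimental, partial CDP implementation, but it lacked many domains (feature areas) and has since been deprecated in favor of WebDriver BiDi. Playwright drives Firefox through its own protocol rather than CDP. For Firefox scraping (Camoufox is Firefox-based), the Playwright API is the cleaner interface.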

Last updated: 2026-05-31