What Is a Computer Use Agent (CUA)?

By the Scrappey Research Team

A Computer Use Agent (CUA) is an AI agent that acts like a person at a keyboard: it logs into a portal as the user, clicks through the screens, deals with MFA (multi-factor login codes) and CAPTCHAs, and hands back clean, structured data. It differs from web scraping in three ways: the user gives permission first (so there's no terms-of-service conflict), it only touches data the user already owns, and it works on sites that have no public API to call. Anthropic's Computer Use, OpenAI's Operator, Skyvern (85.8% WebVoyager), and Browser Use (89% WebVoyager, the leading open-source option at 90K+ GitHub stars) are the current production-grade implementations.

Metric	Value
WebVoyager score, Browser Use	89% (open-source, leading)
WebVoyager score, OpenAI Operator/CUA	87% (controlled VM environments)
WebVoyager score, Skyvern	85.8%
WebVoyager score, Anthropic Computer Use	56% (real desktop environments)
Browser Use GitHub stars	90K+

CUA vs web scraping — different categories

Classic web scraping sends anonymous HTTP or browser requests to public pages. You only get what a logged-out visitor would see. You pay per request, you can run many at once, and responses come back fast.

A Computer Use Agent works the opposite way: the user grants it permission, and it logs in as that user. You pay per task, you can run only a few at a time (each task needs its own VM — a throwaway virtual machine), and each task is slower (30 seconds to several minutes). The legal picture is clean because the data belongs to the user.

A useful mental model: CUAs are "Plaid for any website" — they bring the open-banking pattern (the user gives permission, then data is pulled out in a structured form) to portals that offer no public API. Think utility bills, bank statements, payroll exports, insurance claims, tax filings, or e-commerce backend orders.

When each one wins

Use a CUA when: the data sits behind a login the user owns; the portal has no API; the job needs MFA, step-up authentication (an extra identity check mid-session), or human-style clicking around; or you need occasional, small-scale retrievals (say 5 documents each for 200 users).

Use traditional scraping when: the data is public (e-commerce listings, search results, social media, news, real estate); you need fast responses (under a second); you need to run many requests in parallel (100+ at once); or cost per request matters (scraping is 10–100× cheaper, or more, for the same data when both approaches work).
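
The two checklists above can be read as a simple routing rule. The sketch below is one hypothetical way to encode it in Python: the `Job` fields, the function names, and the order of the checks are assumptions made for illustration, and the "api" branch is an inference from the text (a CUA is only recommended when the portal has no API). The sub-second and 100+ thresholds come from the checklists.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical description of a data-retrieval job (illustrative fields)."""
    behind_user_login: bool     # data sits behind a login the user owns
    has_public_api: bool        # the portal exposes an API for this data
    needs_mfa_or_clicks: bool   # MFA, step-up auth, or human-style navigation
    max_latency_s: float        # slowest acceptable response time, in seconds
    parallel_requests: int      # how many requests must run at once


def choose_tool(job: Job) -> str:
    """Return 'scraping', 'api', or 'cua' following the checklists in the text."""
    if not job.behind_user_login:
        return "scraping"       # public data: classic scraping wins
    if job.has_public_api and not job.needs_mfa_or_clicks:
        return "api"            # assumption: prefer an API when one exists
    return "cua"


def cua_caveats(job: Job) -> list[str]:
    """List the requirements a CUA is unlikely to meet for this job."""
    caveats = []
    if job.max_latency_s < 1.0:
        caveats.append("CUA tasks take 30 seconds to several minutes")
    if job.parallel_requests >= 100:
        caveats.append("each CUA task needs its own VM")
    return caveats


utility_bills = Job(True, False, True, max_latency_s=300.0, parallel_requests=5)
print(choose_tool(utility_bills), cua_caveats(utility_bills))
```

A job that routes to "cua" but also returns caveats is a sign that the requirements themselves need revisiting, since no tool in this comparison offers both login-gated access and sub-second, highly parallel responses.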

For 100k items, scraping might cost €20–€100 on Scrappey. Running 100k CUA tasks could cost $5,000–$100,000 depending on the platform. That cost gap is exactly why these are two different tools, not rivals.
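
As a rough check on those totals, the sketch below multiplies the per-unit price ranges quoted in this page's FAQ by 100,000. The function name is hypothetical, and the scraping range used here is the generic managed-API range in dollars, so it is not expected to match the Scrappey-specific euro estimate exactly.

```python
def cost_range(n_items: int, unit_low: float, unit_high: float) -> tuple[float, float]:
    """Total cost range for n_items priced between unit_low and unit_high each."""
    return n_items * unit_low, n_items * unit_high


N_ITEMS = 100_000

# Per-unit price ranges as quoted in the FAQ (US dollars).
CUA_PER_TASK = (0.05, 1.00)
SCRAPING_PER_REQUEST = (0.0002, 0.003)

cua_low, cua_high = cost_range(N_ITEMS, *CUA_PER_TASK)
scrape_low, scrape_high = cost_range(N_ITEMS, *SCRAPING_PER_REQUEST)

print(f"CUA:      ${cua_low:,.0f} to ${cua_high:,.0f}")
print(f"Scraping: ${scrape_low:,.0f} to ${scrape_high:,.0f}")
# CUA:      $5,000 to $100,000
# Scraping: $20 to $300
```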

The current CUA landscape

Anthropic Computer Use is a direct API in which the model looks at screenshots and returns raw mouse and keyboard actions, which your own code then executes on a desktop environment. Best for building custom agent pipelines. It scores 56% on WebVoyager (a benchmark of real web tasks) because it operates full desktops with all their mess, not stripped-down browser-only VMs.

OpenAI Operator (CUA) — a hosted product with browser control built in; it scores 87% on WebVoyager in controlled environments.

Skyvern — open-source (YC-backed) and driven by a Vision-LLM (a model that reads the screen as an image). It scores 85.8% on WebVoyager and is strong at invoice retrieval, job applications, government forms, and insurance quotes. Available both cloud-hosted and self-hostable.

Browser Use — the leading open-source browser-only agent at 89% WebVoyager, with 90K+ GitHub stars. Plug in any LLM and run it locally or self-hosted. It supports OpenAI, Anthropic, Gemini, and Ollama for local models.

Deck — managed VMs with a credential vault and SOC 2 compliance, positioned as "Plaid for any website" with 100k+ utility provider integrations.

Code example

python

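# Browser Use (open source, 89% WebVoyager): the standard open-source CUA
# pip install browser-use

import asyncio

from browser_use import Agent, ChatOpenAI


async def main():
    agent = Agent(
        # The task text does not carry credentials by itself; they have to be
        # supplied to the agent separately, through your own secret handling.
        task="Log into example-utility.com using the credentials in the env, "
             "navigate to billing history, download the last 12 months of "
             "statements as PDFs, and return the file paths.",
        llm=ChatOpenAI(model="gpt-4o"),    # or ChatAnthropic, Gemini, local Ollama
    )
    # Agent.run() is a coroutine, so it must be awaited; it returns the run history.
    history = await agent.run()
    # final_result() is the agent's final answer for the task (here, the file paths).
    print(history.final_result())


asyncio.run(main())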

Frequently asked questions

Is a CUA the same as a headless browser?

No — they sit at different layers. A headless browser is just Chrome or Firefox running with no visible window; it's the engine that loads pages. A CUA is an AI agent layered on top of that browser (or sometimes a real desktop): it reads the page (visually or through the DOM, the page's element structure), decides the next move, and acts. The headless browser is the body; the CUA is the brain.
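
The body/brain split can be made concrete with a toy loop. In the sketch below, `FakeBrowser` stands in for the headless browser and `scripted_policy` stands in for the model that picks the next action. Both are hypothetical stand-ins written for this example, not the API of any real CUA framework.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBrowser:
    """Stand-in for a headless browser: holds page state and executes actions."""
    page: str = "login"
    log: list = field(default_factory=list)

    def observe(self) -> str:
        # A real agent would receive a screenshot or a DOM snapshot here.
        return self.page

    def act(self, action: str) -> None:
        # Hard-coded page transitions, just enough to drive the loop.
        transitions = {
            ("login", "submit_credentials"): "dashboard",
            ("dashboard", "open_billing"): "billing",
        }
        self.log.append(action)
        self.page = transitions.get((self.page, action), self.page)


def scripted_policy(observation: str) -> str:
    """Stand-in for the LLM: map what is on screen to the next action."""
    plan = {"login": "submit_credentials", "dashboard": "open_billing"}
    return plan.get(observation, "done")


def run_agent(browser: FakeBrowser, policy, max_steps: int = 10) -> list:
    """Observe, decide, act, and repeat until the policy says 'done'."""
    for _ in range(max_steps):
        action = policy(browser.observe())
        if action == "done":
            break
        browser.act(action)
    return browser.log


print(run_agent(FakeBrowser(), scripted_policy))
# ['submit_credentials', 'open_billing']
```

Swapping `scripted_policy` for a call to a vision or DOM-reading model changes the brain without touching the browser layer, which is the separation the answer above describes.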

Why is Anthropic's Computer Use score lower than OpenAI's?

Because they're tested on different difficulty levels. WebVoyager measures browser-only tasks in controlled environments. OpenAI Operator runs in optimised browser-only VMs and scores 87%. Anthropic's Computer Use is more general — it can drive any desktop application, not just a browser — and was benchmarked in the harder real-desktop setting, scoring 56%. They solve overlapping but not identical problems, so the numbers aren't apples-to-apples.

Should I use Browser Use or Skyvern?

Pick Browser Use if you want the highest open-source WebVoyager score and the most active community (89%, 90K+ stars). Pick Skyvern if you specifically want a Vision-LLM driven agent that works from screenshots instead of the DOM (85.8%) — handy when the DOM is dynamically obfuscated (deliberately scrambled to block scrapers). For invoice retrieval and form-filling in particular, Skyvern has more documented production deployments.

When is a CUA cheaper than a managed scraping API?

Almost never for public data. CUAs bill per task ($0.05–$1 each); managed scraping APIs bill per request ($0.0002–$0.003 each). That CUA premium pays for the user-consent flow, MFA handling, and access to login-required data — none of which you need for public data. Use CUAs for portals behind a user login, and scraping APIs for everything else.

Last updated: 2026-05-31 · Facts last verified: 2026-06-16