

Paste or fetch a robots.txt file and test any URL against any user-agent. See parse warnings, grouped rules, sitemaps, and the exact rule that allowed or blocked your path.
Live parsing • Path tester • No registration
/robots.txt is the standard file at the root of a domain that tells crawlers which paths they're allowed to fetch and which they should leave alone. Search engines, AI training crawlers, and well-behaved scrapers all consult it before requesting pages.
A single misplaced Disallow: / can de-index an entire site overnight. This validator parses your file the same way Google does — grouping rules by User-agent, applying the longest-match-wins rule for Allow / Disallow, and supporting * and $ wildcards.
Test specific URLs against specific bots and see exactly which line decides the verdict.
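The matching rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the validator's actual implementation: `pattern_to_regex` and `decide` are hypothetical helper names, precedence is approximated by pattern length, and ties are resolved in favor of Allow, as Google's documentation describes.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the path start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal chunk, then join the chunks with '.*' for every '*'
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def decide(rules, path):
    """rules: (directive, pattern) pairs from one User-agent group.
    Returns (allowed, deciding_rule); deciding_rule is None if nothing matched."""
    best_key, best_rule = None, None
    for directive, pattern in rules:
        if not pattern:  # an empty pattern matches nothing
            continue
        if pattern_to_regex(pattern).match(path):
            # Longer pattern wins; on equal length, Allow beats Disallow
            key = (len(pattern), directive == "allow")
            if best_key is None or key > best_key:
                best_key, best_rule = key, (directive, pattern)
    if best_rule is None:
        return True, None  # no rule matched, so the path is allowed
    return best_rule[0] == "allow", best_rule

rules = [
    ("disallow", "/private/"),
    ("allow", "/private/public/"),
    ("disallow", "/*.pdf$"),
]
print(decide(rules, "/private/public/page"))
print(decide(rules, "/docs/file.pdf"))
```

Returning the deciding rule alongside the verdict is what lets a tester point at the exact line responsible for an allow or a block.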

Paste your robots.txt or fetch it from a live domain, then check any URL against any bot
Verification is only required when fetching from a URL — pasting content works without it.
Use Scrappey to fetch every site's robots.txt at scale, monitor changes, and validate that critical paths stay crawlable.
Match Google's behavior, catch mistakes before they hit production
Groups by User-agent, supports * and $ wildcards, applies longest-match-wins for Allow vs Disallow.
Pick a URL and a bot, see the exact rule that allows or blocks it — including which line and which group decided.
Catches missing colons, orphaned rules, bad paths, empty User-agents, and unknown directives before you ship.
Surfaces every Sitemap: directive with clickable links so you can sanity-check what crawlers will index.
Skip the CORS dance — fetch robots.txt from any public site and validate it in one click.
Preset list includes GPTBot, ChatGPT-User, CCBot, Claude-Web, anthropic-ai, and PerplexityBot for AI training audits.
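As a rough sketch of how grouping and parse warnings might fit together, the hypothetical `parse_robots` function below walks a file line by line. It is an assumption-laden illustration rather than this tool's parser: the warning messages are made up for the example, and only User-agent, Allow, Disallow, and Sitemap are treated as known directives.

```python
def parse_robots(text):
    """Return (groups, sitemaps, warnings) for a robots.txt string."""
    groups = []      # each group: {"agents": [...], "rules": [(directive, path, line_no)]}
    sitemaps = []
    warnings = []    # (line_no, message)
    current = None
    in_agent_run = False  # True while reading consecutive User-agent lines
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if ":" not in line:
            warnings.append((line_no, "missing colon"))
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "sitemap":
            sitemaps.append(value)
        elif key == "user-agent":
            if not value:
                warnings.append((line_no, "empty User-agent"))
            elif in_agent_run:
                # Consecutive User-agent lines share one group
                current["agents"].append(value.lower())
            else:
                current = {"agents": [value.lower()], "rules": []}
                groups.append(current)
                in_agent_run = True
        elif key in ("allow", "disallow"):
            in_agent_run = False
            if current is None:
                warnings.append((line_no, "rule before any User-agent"))
            else:
                if value and not value.startswith(("/", "*")):
                    warnings.append((line_no, "path should start with / or *"))
                current["rules"].append((key, value, line_no))
        else:
            warnings.append((line_no, f"unknown directive: {key}"))
    return groups, sitemaps, warnings

sample = (
    "User-agent: GPTBot\nUser-agent: CCBot\nDisallow: /\n\n"
    "User-agent: *\nAllow: /public/\nDisallow /private/\n"
    "Noindex: /tmp/\nSitemap: https://example.com/sitemap.xml\n"
)
groups, sitemaps, warnings = parse_robots(sample)
print(groups, sitemaps, warnings, sep="\n")
```

Keeping the line number with every rule and warning makes it possible to report which line and which group decided a verdict.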
Use Scrappey to fetch robots.txt across many domains at once
import requests

# Fetch a robots.txt via Scrappey (handles blocked domains too)
api = "https://publisher.scrappey.com/api/v1?key=YOUR_API_KEY"
domains = ["example.com", "another-site.com"]

for d in domains:
    # Ask Scrappey to issue a GET request for this domain's robots.txt
    res = requests.post(api, json={
        "cmd": "request.get",
        "url": f"https://{d}/robots.txt"
    }).json()
    # Print the first 200 characters of the fetched file
    print(d, "→", res["solution"]["response"][:200])

Confirm critical pages stay crawlable by Googlebot and Bingbot after deploys, redesigns, or staging swaps.
Decide which AI training crawlers (GPTBot, CCBot, etc.) to block — and verify your rules actually do it.
Check what your scraper is allowed to touch before you build it, so you stay within published crawl policy.
Catch the accidental "Disallow: /" before a launch — pipe staging robots.txt through this tool in CI.
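A CI check along these lines could look like the sketch below. It does not call this tool; it uses Python's standard-library `urllib.robotparser` as a stand-in, and the `blocked_paths` helper and the example paths are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

def blocked_paths(robots_text, agent, paths):
    """Return the paths from `paths` that `agent` is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [p for p in paths if not parser.can_fetch(agent, p)]

# Hypothetical list of pages that must stay crawlable
critical = ["/", "/pricing", "/blog/"]

# A staging file that accidentally shipped with a blanket block
staging_robots = "User-agent: *\nDisallow: /\n"

for agent in ("Googlebot", "Bingbot"):
    for path in blocked_paths(staging_robots, agent, critical):
        print(f"BLOCKED for {agent}: {path}")
```

In a pipeline, the script would read the staging file from disk and exit with a non-zero status when any critical path is blocked. Note that `urllib.robotparser` applies the first matching rule in file order and does not interpret `*` or `$` inside paths, so its verdicts can differ from Google's longest-match behavior on files that rely on those features.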
Automate workflows visually. Streamline data collection processes.
Pre-built template for modern websites. Simplifies Scrappey integration.
Access via API marketplace. Easy integration with comprehensive docs.
Scalable actor-based automation. Reliable browser rendering.
AI-powered browser automation. Intelligent session management.
Scrape from your terminal. One command, pipeable output, CI-ready.
Portable skill for Claude Code + Codex. Browser-backed data access on demand.
LangChain connector — clean web data for any chain or agent.

LlamaIndex reader — load modern web pages straight into RAG.
Connect with 7,000+ apps. Automate workflows easily.
Visual workflow automation. Connect with 1,000+ apps easily.
Try it for free. No subscription required. No credit card required. Instant setup. 150 free requests are waiting for you!
Scrappey.com is a web scraping API that takes care of the complex parts of web scraping: dynamic content, rotating proxies, advanced request handling, headless browsers, and verification processing. It offers an all-in-one solution for extracting publicly available data from websites.
Scrappey.com provides a web scraping API: you send a request, and it returns publicly available data from the target website. It handles dynamic content and modern website complexity, including rotating proxies, advanced request handling, and verification processing, using built-in features such as headless browsers and AI-powered data extraction.
Yes, with Scrappey.com, you have the option to use Sticky Rotating Proxies for seamless scraping. Alternatively, you can also set your own proxies if desired.
Yes, Scrappey.com offers a free trial where you can try it out without a subscription or credit card. Instant setup is provided, and you get 150 free scrapes to explore the capabilities of the platform.
We only charge for successful requests. Failed requests are not counted towards your usage, so you only pay for what works.
No problem. You can pass any JavaScript snippet that needs to run by using our JavaScript scenario parameter. This allows you to interact with dynamic content, scroll pages, click buttons, wait for elements, and perform any custom JavaScript actions before extracting the data.
Scrappey.com offers simple and transparent pricing: €0.20 per 1,000 direct HTTP requests and €1.00 per 1,000 full-browser requests. Residential proxies are included on both tiers — no separate proxy billing, no hidden fees, no complicated pricing tiers. You only pay for successful requests.
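To make the arithmetic concrete, here is a small cost estimate based only on the two rates quoted above. The `estimate_cost` function and the request volumes are hypothetical, and the sketch ignores anything not stated here, such as taxes or volume discounts.

```python
from decimal import Decimal

# Rates quoted above, in euros per 1,000 successful requests
PRICE_PER_1000 = {
    "http": Decimal("0.20"),     # direct HTTP requests
    "browser": Decimal("1.00"),  # full-browser requests
}

def estimate_cost(successful_http, successful_browser):
    """Estimated bill in euros; failed requests are not counted."""
    total = (
        Decimal(successful_http) * PRICE_PER_1000["http"]
        + Decimal(successful_browser) * PRICE_PER_1000["browser"]
    )
    return total / 1000

# Example volumes (assumed): 50,000 direct requests and 10,000 browser requests
print(estimate_cost(50_000, 10_000))
```

For those example volumes, each tier contributes €10, for an estimated total of €20.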
Scrappey.com provides scalable access for extracting publicly available data. Whether you need to extract data from a few pages or a large dataset of publicly accessible content, you can do so with flexible usage options. Please note that Scrappey.com only supports scraping publicly available data, and users must comply with applicable laws and website terms of service.
Scrappey.com provides several support channels. You can refer to the documentation, the frequently asked questions section, the blog, and the uptime status page. You can also get in touch by email or join the Discord community for further support.
We don't create custom scraping scripts. However, we will gladly write code snippets to help you use our most powerful features: AI-powered data extraction and the JavaScript scenario. Our documentation includes examples in multiple programming languages to get you started quickly.
Each API call to Scrappey counts as one request. Our pricing is based on successful requests. By default, JavaScript rendering is enabled, which allows you to extract data from modern websites with dynamic content. All features including proxies, CAPTCHA solving, and advanced web access handling are included in each request.
Scrappey's API is optimized for fast response times, even when dealing with complex or protected websites. Where other scrapers struggle with sites that have advanced security measures, Scrappey is designed to handle these challenges efficiently. Our advanced web access handling, residential proxies, and intelligent retry logic work together to maximize success rates.