Glowing Web Network

Robots.txt Validator & Tester

Paste or fetch a robots.txt file and test any URL against any user-agent. See parse warnings, grouped rules, sitemaps, and the exact rule that allowed or blocked your path.

Live parsing • Path tester • No registration

What is robots.txt?

/robots.txt is the standard file at the root of a domain that tells crawlers which paths they're allowed to fetch and which they should leave alone. Search engines, AI training crawlers, and well-behaved scrapers all consult it before requesting pages.

A single misplaced Disallow: / can block crawlers from an entire site and cause its pages to drop out of search results. This validator parses your file the same way Google does: it groups rules by User-agent, applies the longest-match-wins rule for Allow / Disallow, and supports the * and $ wildcards.

Test specific URLs against specific bots and see exactly which line decides the verdict.
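
The longest-match-wins rule with * and $ wildcards can be sketched in a few lines of Python. This is a minimal illustration, not the validator's actual code: the function names pattern_to_regex and decide are hypothetical, and pattern length is measured in characters as a simplification.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")      # a trailing '$' pins the end of the URL
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then let every '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def decide(rules, path):
    """rules: (directive, pattern) pairs from ONE group, directive in lowercase.
    Returns (allowed, winning_rule), where winning_rule is None if nothing matched."""
    best_key, best_rule = None, None
    for directive, pattern in rules:
        if not pattern:                   # an empty pattern matches nothing
            continue
        if pattern_to_regex(pattern).match(path):
            # The longer pattern wins; on a tie, Allow beats Disallow.
            key = (len(pattern), directive == "allow")
            if best_key is None or key > best_key:
                best_key, best_rule = key, (directive, pattern)
    if best_rule is None:
        return True, None                 # no matching rule: allowed by default
    return best_rule[0] == "allow", best_rule

rules = [("disallow", "/admin/"), ("allow", "/admin/public/")]
print(decide(rules, "/admin/public/page"))   # the longer Allow pattern decides
print(decide(rules, "/admin/secret"))        # only the Disallow pattern matches
```

Returning the winning rule alongside the verdict is what lets a tester report which line decided the outcome.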

Validate & Test

Paste your robots.txt or fetch it from a live domain, then check any URL against any bot

Verification is only required when fetching from a URL — pasting content works without it.

Parse summary

Groups: 2
Rules: 3
Sitemaps: 1
Issues: 0

Test a URL against a bot

Allowed
No matching Allow/Disallow rule for "/admin/public/page" in this group. Default: allowed.
Matched group: User-agent: Googlebot

Parsed groups

User-agent: *
  Line 3: Disallow: /admin/
  Line 4: Allow: /admin/public/
User-agent: Googlebot
  Line 7: Disallow: /private/
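
The verdict above follows from how groups are selected: a bot with its own group uses only that group and does not inherit rules from *. The sketch below reproduces the parse summary under stated assumptions. The SAMPLE file is a reconstruction chosen to match the line numbers shown (its comment line and sitemap URL are invented), and parse_groups and select_group are hypothetical helper names, not the tool's API.

```python
SAMPLE = """\
# example file (assumed contents)
User-agent: *
Disallow: /admin/
Allow: /admin/public/

User-agent: Googlebot
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
"""

def parse_groups(text):
    """Return (groups, sitemaps). groups maps a lowercase user-agent token
    to a list of (line_number, directive, value) rules."""
    groups, sitemaps = {}, []
    current, last_was_agent = [], False
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()       # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not last_was_agent:
                current = []                      # a new group starts here
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
            last_was_agent = True
        elif field in ("allow", "disallow"):
            for agent in current:
                groups[agent].append((lineno, field, value))
            last_was_agent = False
        elif field == "sitemap":
            sitemaps.append(value)                # sitemaps belong to no group
    return groups, sitemaps

def select_group(groups, bot):
    """Pick the group for `bot`: an exact token match, else the '*' group."""
    token = bot.lower()
    if token in groups:
        return token
    return "*" if "*" in groups else None

groups, sitemaps = parse_groups(SAMPLE)
print(len(groups), sum(len(r) for r in groups.values()), len(sitemaps))
print(groups[select_group(groups, "Googlebot")])
```

Because the Googlebot group holds only the /private/ rule, /admin/public/page matches nothing in it and is allowed by default. Exact-token matching is a simplification; real crawlers may match user-agent tokens more loosely.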

Audit Robots.txt Across Your Whole Domain

Use Scrappey to fetch every site's robots.txt at scale, monitor changes, and validate that critical paths stay crawlable.

Get 150 Free Credits

Why Use Scrappey's Robots.txt Validator?

Match Google's behavior, catch mistakes before they hit production

Google-Spec Parsing

Groups by User-agent, supports * and $ wildcards, applies longest-match-wins for Allow vs Disallow.

Path Tester

Pick a URL and a bot, see the exact rule that allows or blocks it — including which line and which group decided.

Issue Detection

Catches missing colons, orphaned rules, bad paths, empty User-agents, and unknown directives before you ship.
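
As a rough idea of what such checks might look like, here is a small linter sketch. It is an illustration under assumptions rather than the validator's implementation: the lint function, the KNOWN set, and the warning messages are all hypothetical, and crawl-delay is treated as known even though not every crawler honors it.

```python
# Directives this sketch treats as recognized (an assumption, not an official list).
KNOWN = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}

def lint(text):
    """Return a list of (line_number, message) warnings for a robots.txt body."""
    issues, seen_agent = [], False
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()       # ignore comments and blanks
        if not line:
            continue
        if ":" not in line:
            issues.append((lineno, "missing colon"))
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field not in KNOWN:
            issues.append((lineno, f"unknown directive '{field}'"))
        elif field == "user-agent":
            seen_agent = True
            if not value:
                issues.append((lineno, "empty User-agent"))
        elif field in ("allow", "disallow"):
            if not seen_agent:
                issues.append((lineno, "orphaned rule (no User-agent above it)"))
            if value and not value.startswith(("/", "*")):
                issues.append((lineno, "path should start with '/' or '*'"))
    return issues

bad = "Disallow: /tmp/\nUser-agent *\nUser-agent:\nDisallow: private\nNoindex: /x\n"
for lineno, message in lint(bad):
    print(f"line {lineno}: {message}")
```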

Sitemap Discovery

Surfaces every Sitemap: directive with clickable links so you can sanity-check which sitemaps crawlers are being pointed to.

Fetch Any Domain

Skip the CORS dance — fetch robots.txt from any public site and validate it in one click.

AI Bot Coverage

Preset list includes GPTBot, ChatGPT-User, CCBot, Claude-Web, anthropic-ai, and PerplexityBot for AI training audits.

Audit Robots.txt in Code

Use Scrappey to fetch robots.txt across many domains at once

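import requests

# Fetch robots.txt files via Scrappey (handles blocked domains too)
API = "https://publisher.scrappey.com/api/v1?key=YOUR_API_KEY"
domains = ["example.com", "another-site.com"]

for domain in domains:
    # Ask Scrappey to issue a GET request for the domain's robots.txt
    resp = requests.post(
        API,
        json={"cmd": "request.get", "url": f"https://{domain}/robots.txt"},
        timeout=120,  # illustrative value; tune for your workload
    )
    resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
    res = resp.json()
    # Print the first 200 characters of the fetched file
    print(domain, "→", res["solution"]["response"][:200])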

Perfect For

SEO Audits

Confirm critical pages stay crawlable by Googlebot and Bingbot after deploys, redesigns, or staging swaps.

AI Bot Policy

Decide which AI training crawlers (GPTBot, CCBot, etc.) to block — and verify your rules actually do it.

Scraping Compliance

Check what your scraper is allowed to touch before you build it, so you stay within published crawl policy.

Pre-Launch QA

Catch the accidental "Disallow: /" before a launch — pipe staging robots.txt through this tool in CI.
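
A CI guard for this case could be as small as the sketch below. It is a standalone illustration that does not call this tool: blanket_disallow_lines is a hypothetical helper that flags Disallow: / (or /*) in any group applying to the given user-agent, and it deliberately ignores any Allow rules that might override it.

```python
def blanket_disallow_lines(text, agent="*"):
    """Return line numbers of 'Disallow: /' rules in groups that name `agent`."""
    hits, agents, last_was_agent = [], [], False
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()       # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not last_was_agent:
                agents = []                       # a new group starts here
            agents.append(value.lower())
            last_was_agent = True
        elif field in ("allow", "disallow"):
            last_was_agent = False
            blocks_all = field == "disallow" and value in ("/", "/*")
            if blocks_all and agent.lower() in agents:
                hits.append(lineno)
    return hits

staging = "User-agent: *\nDisallow: /\n"
print(blanket_disallow_lines(staging))
```

In a pipeline step, one option is to read the deployed file, call the function, and exit with a non-zero status whenever the returned list is non-empty, so the build fails before launch.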

Start building with Scrappey

Try It For Free. No Subscription Required. No Credit Card Required. Instant Set-Up. 150 Free Requests Are Waiting For You!

Frequently asked questions

What is Scrappey.com?

Scrappey.com is a web scraping API that handles the complex aspects of web scraping, such as dynamic content, rotating proxies, advanced request handling, headless browsers, and verification processing. It offers an all-in-one solution for extracting publicly available data from websites.

How does Scrappey.com work?

Scrappey.com provides a web scraping API: you send a request, and it extracts publicly available data from the target website. It handles dynamic content and modern website complexity, including rotating proxies, advanced request handling, and verification processing, with built-in features such as headless browsers and AI-powered data extraction.

Can I customize the proxies used for scraping?

Yes, with Scrappey.com, you have the option to use Sticky Rotating Proxies for seamless scraping. Alternatively, you can also set your own proxies if desired.

Is there a free trial available?

Yes, Scrappey.com offers a free trial where you can try it out without a subscription or credit card. Instant setup is provided, and you get 150 free scrapes to explore the capabilities of the platform.

What happens if a request fails?

We only charge for successful requests. Failed requests are not counted towards your usage, so you only pay for what works.

I need to scroll or click on a button on the page I want to scrape

No problem, you can pass any JavaScript snippet that needs to be executed by using our JavaScript scenario parameter. This allows you to interact with dynamic content, scroll pages, click buttons, wait for elements, and perform any custom JavaScript actions before extracting the data.

What is the pricing structure for Scrappey.com?

Scrappey.com offers simple and transparent pricing: €0.20 per 1,000 direct HTTP requests and €1.00 per 1,000 full-browser requests. Residential proxies are included on both tiers — no separate proxy billing, no hidden fees, no complicated pricing tiers. You only pay for successful requests.

Are there any usage restrictions or limitations?

Scrappey.com provides scalable access for extracting publicly available data. Whether you need to extract data from a few pages or a large dataset of publicly accessible content, you can do so with flexible usage options. Please note that Scrappey.com only supports scraping publicly available data, and users must comply with applicable laws and website terms of service.

What support channels are available?

Scrappey.com provides various support channels for assistance. You can refer to their documentation, frequently asked questions section, blog, and uptime status page. Additionally, you can get in touch with them via email or join their Discord community for further support.

I'm not a developer, can you create custom scraping scripts for me?

We don't create custom scraping scripts; however, we will gladly write code snippets that help you use our most powerful features: AI-powered data extraction and JavaScript scenario. Our documentation includes examples in multiple programming languages to get you started quickly.

What is a request and how are they counted?

Each API call to Scrappey counts as one request. Our pricing is based on successful requests. By default, JavaScript rendering is enabled, which allows you to extract data from modern websites with dynamic content. All features including proxies, CAPTCHA solving, and advanced web access handling are included in each request.

How fast is Scrappey's API and what if a site is hard to scrape?

Scrappey's API is optimized for fast response times, even when dealing with complex or protected websites. Where other scrapers struggle with sites that have advanced security measures, Scrappey is designed to handle these challenges efficiently. Our advanced web access handling, residential proxies, and intelligent retry logic work together to maximize success rates.