URL finder

How to find all URLs on a domain — multiple methods

Discover the pages of a website using regex-based link extraction and crawling. Map a domain with automatic link discovery and same-domain filtering.

Free demo trial • No credit card required • Setup in <2 minutes

Why find all URLs on a domain?

Finding all URLs on a website is essential for web scraping, SEO audits, site migrations, and competitive analysis. Whether you're mapping a competitor's site structure, preparing for a redesign, or conducting a comprehensive SEO audit, having a complete URL inventory is crucial.

Our tool uses regex patterns to extract the links (href and src attributes) from each page, then automatically crawls the discovered URLs that belong to the same domain. With concurrent processing (up to 5 simultaneous requests) and automatic web access handling, you can map entire websites efficiently.

Perfect for SEO professionals, web developers, and data analysts who need comprehensive site mapping without manual work.

URL Finder API
{
  "cmd": "request.get",
  "url": "https://example.com",
  "requestType": "request",
  "regex": "(?:href|src)=\"([^\"]+)\"|(?:href|src)='([^']+)'",
  "filter": ["regex"]
}

// Response:
{
  "solution": {
    "regex": [
      "/page1",
      "/page2",
      "https://example.com/page3"
    ]
  }
}

Find all URLs on any domain

Enter a starting URL and discover all pages automatically

Multiple methods to find URLs

Choose the method that works best for your needs

Google Search Operators

Use the site: operator (for example, site:example.com) to find indexed pages. Quick, but it misses pages that Google has not indexed.

Quick • Free • Limited

Sitemap & robots.txt

Parse XML sitemaps and robots.txt files to discover all declared URLs.

Comprehensive • Technical • Reliable

SEO Crawling Tools

Use tools like Screaming Frog or XML-Sitemaps.com for visual crawling.

GUI • Easy • Limited Free

Custom Scripting

Build your own crawler with Python, JavaScript, or other languages for full control.

Flexible • Custom • Technical
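For the sitemap and robots.txt method, the parsing step can be sketched with the Python standard library alone. This is a minimal illustration, not part of the Scrappey API: the helper names `sitemaps_from_robots` and `urls_from_sitemap` are hypothetical, and the sample robots.txt and sitemap contents are made up. Downloading the files is left out so the example runs offline.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML sitemaps (sitemaps.org protocol)
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemaps_from_robots(robots_txt):
    """Return the sitemap URLs declared on 'Sitemap:' lines of a robots.txt file."""
    found = []
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found

def urls_from_sitemap(xml_text):
    """Return every <loc> value from a sitemap (or sitemap index) document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]

# Made-up sample inputs
robots = "User-agent: *\nDisallow: /private\nSitemap: https://example.com/sitemap.xml\n"
sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

print(sitemaps_from_robots(robots))
print(urls_from_sitemap(sitemap))
```

A sitemap index uses the same `<loc>` element to point at further sitemaps, so the second helper can be applied again to each URL it returns from an index file.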

Unlock Full Functionality Without Limits

Register for a free account to access all tools with unlimited usage and advanced features.


Why use Scrappey for URL discovery?

Fast, reliable, and optimized for comprehensive site mapping

Lightning Fast

Concurrent crawling with up to 5 simultaneous requests. Map large sites in minutes, not hours.

Automatic Discovery

Intelligently extracts all links from each page and follows same-domain URLs automatically.

Advanced Web Access

Automatic handling of CDN protection, bot management, and other web access challenges. Designed for high reliability.

Domain Filtering

Automatically filters links to only crawl URLs from the same domain, preventing external crawls.

Progress Tracking

Real-time progress updates showing discovered URLs, crawled pages, and current status.

Regex-Powered

Uses advanced regex patterns to extract all href and src attributes from HTML content.
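To see what the href/src pattern captures, it can be applied locally with Python's `re` module. This is a rough sketch for illustration only: the function name `extract_links` and the sample HTML are assumptions, and it says nothing about how the hosted service runs the pattern. Because the pattern has two capture groups (one for double-quoted values, one for single-quoted), each match fills exactly one of them.

```python
import re
from urllib.parse import urljoin

# Same pattern as in the API example: double-quoted or single-quoted href/src values
LINK_PATTERN = re.compile(r"""(?:href|src)="([^"]+)"|(?:href|src)='([^']+)'""")

def extract_links(html, base_url):
    """Return absolute, de-duplicated link targets in document order."""
    links = []
    for double_quoted, single_quoted in LINK_PATTERN.findall(html):
        raw = double_quoted or single_quoted  # only one group is non-empty per match
        absolute = urljoin(base_url, raw)     # resolves relative paths like /page1
        if absolute not in links:
            links.append(absolute)
    return links

html = '<a href="/page1">One</a> <img src=\'/logo.png\'> <a href="https://example.com/page3">Three</a>'
print(extract_links(html, "https://example.com"))
```

A regex is a pragmatic choice here; it will also match href/src text inside comments or scripts, which an HTML parser would skip.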

Simple integration

Extract URLs with just a few lines of code

import requests
from collections import deque
from urllib.parse import urljoin, urlparse

API_KEY = "YOUR_API_KEY"
API_URL = "https://publisher.scrappey.com/api/v1"
# Raw triple-quoted string so the quotes inside the pattern need no escaping
LINK_REGEX = r"""(?:href|src)="([^"]+)"|(?:href|src)='([^']+)'"""

def extract_domain(url):
    """Extract domain from URL"""
    return urlparse(url).netloc

def normalize_url(url, base_url):
    """Convert relative URLs to absolute (absolute URLs pass through unchanged)"""
    return urljoin(base_url, url)

def find_urls_on_page(page_url, domain):
    """Use Scrappey to find all same-domain URLs on a page"""
    payload = {
        "cmd": "request.get",
        "url": page_url,
        "requestType": "request",
        "regex": LINK_REGEX,
        "filter": ["regex"]
    }

    try:
        response = requests.post(f"{API_URL}?key={API_KEY}", json=payload, timeout=60)
        response.raise_for_status()
        data = response.json()
    except (requests.RequestException, ValueError) as error:
        # Skip this page instead of aborting the whole crawl
        print(f"Request failed for {page_url}: {error}")
        return []

    # Filter to same domain
    same_domain = []
    for url in (data.get('solution') or {}).get('regex') or []:
        normalized = normalize_url(url, page_url)
        if extract_domain(normalized) == domain:
            same_domain.append(normalized)
    return same_domain

# Start crawling
start_url = "https://example.com"
domain = extract_domain(start_url)
visited = set()
queued = {start_url}  # every URL ever added to the queue, for fast duplicate checks
queue = deque([start_url])

while queue and len(visited) < 200:
    current_url = queue.popleft()
    visited.add(current_url)
    print(f"Crawling: {current_url}")

    new_urls = find_urls_on_page(current_url, domain)
    for url in new_urls:
        if url not in queued:
            queued.add(url)
            queue.append(url)

    print(f"Found {len(new_urls)} URLs, Crawled so far: {len(visited)}")
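The loop above crawls one page at a time. A concurrent variant, in the spirit of the "up to 5 simultaneous requests" feature, might look like the sketch below. It is an assumption about how one could structure this with Python's `concurrent.futures`, not a description of the tool's internals. The name `crawl_concurrently` and the `FAKE_SITE` link graph are hypothetical; in practice `fetch_links` would be something like `lambda url: find_urls_on_page(url, domain)`.

```python
from concurrent.futures import ThreadPoolExecutor

def crawl_concurrently(start_url, fetch_links, max_pages=200, workers=5):
    """Breadth-first crawl that fetches each level of links with a thread pool."""
    visited = set()
    frontier = [start_url]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier and len(visited) < max_pages:
            # Never schedule more pages than the remaining budget allows
            batch = frontier[: max_pages - len(visited)]
            visited.update(batch)
            next_frontier, seen_next = [], set()
            # pool.map runs fetch_links on up to `workers` URLs at once
            for links in pool.map(fetch_links, batch):
                for link in links:
                    if link not in visited and link not in seen_next:
                        seen_next.add(link)
                        next_frontier.append(link)
            frontier = next_frontier
    return visited

# Hypothetical in-memory link graph standing in for real HTTP requests
FAKE_SITE = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/b", "https://example.com/c"],
    "https://example.com/b": ["https://example.com/"],
    "https://example.com/c": [],
}

pages = crawl_concurrently("https://example.com/", lambda url: FAKE_SITE.get(url, []))
print(sorted(pages))
```

Threads suit this workload because each request spends most of its time waiting on the network.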

Perfect for

SEO Audits

Map entire websites to identify orphan pages, broken links, and site structure issues for comprehensive SEO analysis.

Competitor Analysis

Discover all pages on competitor websites to understand their content strategy and site architecture.

Site Migrations

Create complete URL inventories before website redesigns or platform migrations to ensure nothing is missed.

Content Discovery

Find all content pages, blog posts, and resources on a website for content analysis and research.
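For the SEO audit case, orphan pages are by definition not reachable through internal links, so a link crawl alone cannot find them. One common approach is to compare the crawl result against the URLs declared in the sitemap. The sketch below assumes both lists are already available; the function name `find_orphan_candidates` and the sample URLs are hypothetical.

```python
def find_orphan_candidates(sitemap_urls, crawled_urls):
    """Return sitemap URLs that the link crawl never reached."""
    def canonical(url):
        # Treat URLs that differ only by a trailing slash as the same page
        return url.rstrip("/")

    crawled = {canonical(url) for url in crawled_urls}
    return sorted(url for url in set(sitemap_urls) if canonical(url) not in crawled)

sitemap_urls = [
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/old-landing-page",
]
crawled_urls = ["https://example.com", "https://example.com/about/"]

print(find_orphan_candidates(sitemap_urls, crawled_urls))
```

Pages in the result are only candidates: a crawl that stopped at its page limit may simply not have reached them yet.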

How to get all page URLs from a website

Our URL finder tool helps you crawl a website for all of its URLs efficiently. Whether you need URLs for SEO analysis, site migration, or competitive research, this URL extractor makes it simple.

Getting URLs from a website takes only a few steps. Enter a starting URL and our tool will automatically discover the links to the site's pages. You can copy all URLs with a single click or download them as a CSV file.
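If you collect URLs with your own script instead, a CSV export comparable to the download option can be sketched with Python's `csv` module. The function name `write_urls_csv` and the single `url` column are assumptions for illustration; the layout of the tool's own CSV file is not documented here.

```python
import csv
import io

def write_urls_csv(urls, handle):
    """Write unique URLs, sorted, as a one-column CSV with a header row."""
    writer = csv.writer(handle)
    writer.writerow(["url"])
    for url in sorted(set(urls)):
        writer.writerow([url])

# An in-memory buffer keeps the example self-contained; for a real file use
# open("urls.csv", "w", newline="", encoding="utf-8") instead.
buffer = io.StringIO()
write_urls_csv(
    ["https://example.com/about", "https://example.com/", "https://example.com/about"],
    buffer,
)
print(buffer.getvalue())
```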

This crawl list feature finds the webpages on a site by following links automatically. Our list crawl system filters URLs to the same domain, so you find the website links that matter most.

Perfect for developers, SEO professionals, and data analysts who need to find all webpages on a site quickly. The tool handles modern website complexity automatically, so you can crawl a website for all of its URLs without worrying about CAPTCHAs or JavaScript rendering.


Start building with Scrappey

Try It For Free. No Subscription Required. No Credit Card Required. Instant Set-Up. Your Free Trial Is Waiting For You!

Frequently asked questions

What is Scrappey.com?

Scrappey.com is a web scraping API that handles the complex aspects of web scraping, such as dynamic content, rotating proxies, advanced request handling, headless browsers, and verification processing. It offers an all-in-one solution for extracting publicly available data from websites.

How does Scrappey.com work?

Scrappey.com provides a web scraping API that lets you send requests to extract publicly available data from websites. It handles dynamic content and modern website complexity, including rotating proxies, advanced request handling, and verification processing, and it offers built-in features such as headless browsers and AI-powered data extraction.

Can I customize the proxies used for scraping?

Yes, with Scrappey.com, you have the option to use Sticky Rotating Proxies for seamless scraping. Alternatively, you can also set your own proxies if desired.

Is there a free trial available?

Yes, Scrappey.com offers a free trial where you can try it out without a subscription or credit card. Instant setup is provided, so you can explore the full capabilities of the platform right away.

What happens if a request fails?

We only charge for successful requests. Failed requests are not counted towards your usage, so you only pay for what works.

I need to scroll or click on a button on the page I want to scrape

No problem, you can pass any JavaScript snippet that needs to be executed by using our JavaScript scenario parameter. This allows you to interact with dynamic content, scroll pages, click buttons, wait for elements, and perform any custom JavaScript actions before extracting the data.

What is the pricing structure for Scrappey.com?

Scrappey.com offers simple and transparent pricing: €0.20 per 1,000 direct HTTP requests and €1.00 per 1,000 full-browser requests. Residential proxies are included on both tiers — no separate proxy billing, no hidden fees, no complicated pricing tiers. You only pay for successful requests.
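As a quick worked example of these rates, the sketch below estimates the cost of a crawl. The function name `estimate_cost` is hypothetical, only successful requests should be counted, and the figures are an estimate rather than a billing statement.

```python
from decimal import Decimal

DIRECT_RATE = Decimal("0.20")   # EUR per 1,000 direct HTTP requests
BROWSER_RATE = Decimal("1.00")  # EUR per 1,000 full-browser requests

def estimate_cost(direct_requests, browser_requests):
    """Estimated cost in EUR for the given numbers of successful requests."""
    return (DIRECT_RATE * direct_requests + BROWSER_RATE * browser_requests) / 1000

# A 200-page crawl using direct HTTP requests only
print(estimate_cost(200, 0))  # 0.04
```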

Are there any usage restrictions or limitations?

Scrappey.com provides scalable access for extracting publicly available data. Whether you need to extract data from a few pages or a large dataset of publicly accessible content, you can do so with flexible usage options. Please note that Scrappey.com only supports scraping publicly available data, and users must comply with applicable laws and website terms of service.

What support channels are available?

Scrappey.com provides various support channels for assistance. You can refer to their documentation, frequently asked questions section, blog, and uptime status page. Additionally, you can get in touch with them via email or join their Discord community for further support.

I'm not a developer, can you create custom scraping scripts for me?

We don't create custom scraping scripts, but we will gladly write code snippets that help you use our most powerful features: AI-powered data extraction and JavaScript scenario. Our documentation includes examples in multiple programming languages to get you started quickly.

What is a request and how are they counted?

Each API call to Scrappey counts as one request. Our pricing is based on successful requests. By default, JavaScript rendering is enabled, which allows you to extract data from modern websites with dynamic content. All features including proxies, challenge handling, and reliable web access handling are included in each request.

How fast is Scrappey's API and what if a site is hard to scrape?

Scrappey's API is optimized for fast response times, even on JavaScript-heavy websites and in browser verification flows where access is authorized. If other tools struggle with sites that use browser verification, Scrappey is designed to handle these workflows efficiently. Our web access handling, residential proxies, and intelligent retry logic work together to maximize success rates.