Why HTML-only scraping fails on SPAs
A single-page app serves a near-empty HTML shell — the real content is rendered client-side by JavaScript that fetches data from an API and updates the DOM. A plain HTTP fetch sees the shell, not the content. Scraping these sites requires executing the JavaScript, waiting for the DOM updates to settle, and only then capturing the HTML. That is exactly what a JS-rendering scraping API does for you.
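To make the shell problem concrete, here is a minimal sketch using only Python's standard library. The two HTML strings are invented for illustration: `SHELL` stands in for what a plain HTTP fetch might return, and `RENDERED` for the same page after the JavaScript has run. The `visible_text` helper is a hypothetical name, not part of any scraping API.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, ignoring tags whose content is not page content."""

    # <title> is metadata; <script>, <style>, <noscript> are never the data we want.
    SKIP = {"script", "style", "noscript", "title"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def visible_text(html):
    """Return the text a reader would see, joined with single spaces."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Invented example: what the server sends before any JavaScript runs.
SHELL = """<!doctype html>
<html><head><title>Shop</title><script src="/static/app.js"></script></head>
<body><div id="root"></div><script>window.__BOOT__ = {};</script></body></html>"""

# Invented example: the same page after client-side rendering.
RENDERED = """<!doctype html>
<html><head><title>Shop</title></head>
<body><div id="root"><h1>Blue Kettle</h1><p>$24.00</p></div></body></html>"""

if __name__ == "__main__":
    print(repr(visible_text(SHELL)))
    print(repr(visible_text(RENDERED)))
```

The shell yields no visible text at all, while the rendered page yields the product name and price. A check like this is only a rough signal, but it shows why the capture has to happen after the scripts have run.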
What to look for
Look for a real browser engine (Chromium or Firefox) rather than a JS shim. Wait strategies should be configurable: wait for a CSS selector, for network idle, or for a custom JS predicate. The API should support scrolling and clicking to trigger lazy-loaded content, and it should offer per-request proxy and fingerprint control. Network capture lets you grab the underlying XHR data directly, which is often cleaner than re-extracting it from rendered HTML. Finally, look for cost transparency: JS rendering is more expensive than a plain HTML fetch, so you want to render only when needed.
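The "render only when needed" point can be sketched as a two-step fetch: try the cheap plain request first, and escalate to rendering only if the content you need is missing. Everything below is an assumption for illustration. `fetch_html` is a hypothetical helper, the two `fake_*` functions stand in for a real HTTP client and a real rendering API, and the `class='title'` check is just one possible way to decide that the page has content.

```python
from typing import Callable, Tuple


def fetch_html(
    url: str,
    plain_fetch: Callable[[str], str],
    render_fetch: Callable[[str], str],
    has_content: Callable[[str], bool],
) -> Tuple[str, bool]:
    """Return (html, was_rendered), rendering only if the cheap fetch falls short."""
    html = plain_fetch(url)
    if has_content(html):
        return html, False
    # The plain response lacks the content, so pay for a rendered fetch.
    return render_fetch(url), True


# Stand-in fetchers so the sketch runs without network access or a real API.
def fake_plain_fetch(url):
    if url.endswith("/about"):
        # A server-rendered page: the content is already in the HTML.
        return "<html><body><h1 class='title'>About us</h1></body></html>"
    # An SPA route: only the empty shell comes back.
    return "<html><body><div id='root'></div></body></html>"


def fake_render_fetch(url):
    return (
        "<html><body><div id='root'>"
        "<h1 class='title'>Blue Kettle</h1></div></body></html>"
    )


def has_title(html):
    """Hypothetical content check: does the element we want to scrape exist?"""
    return "class='title'" in html


if __name__ == "__main__":
    for path in ("/about", "/products/42"):
        _, rendered = fetch_html(
            "https://example.com" + path,
            fake_plain_fetch,
            fake_render_fetch,
            has_title,
        )
        print(path, "rendered" if rendered else "plain fetch was enough")
```

Passing the fetchers in as callables keeps the decision logic independent of any particular provider, so the same check works whether the rendering step is a hosted API or a local headless browser.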
When NOT to use a rendering API
If the SPA fetches its data from a JSON endpoint you can identify, hitting that endpoint directly is faster, cheaper, and more reliable than rendering. Open the browser's network tab, find the XHR that returns the data, and replicate the request. The rendering API is the fallback when the requests are signed, the responses are encrypted, or the endpoint is otherwise impractical to call directly.
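Replicating the request might look like the following sketch, which uses only the standard library. The endpoint URL, the query parameters, the headers, and the `items` / `name` / `price` response shape are all hypothetical; in practice you would copy them from the request you found in the network tab.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint; replace with the XHR URL seen in the network tab.
API_URL = "https://example.com/api/products"


def build_request(page, per_page=50):
    """Build a request that mirrors what the SPA's own JavaScript sends."""
    query = urlencode({"page": page, "per_page": per_page})
    headers = {
        "Accept": "application/json",
        # Some backends check these; copy whatever the browser actually sent.
        "X-Requested-With": "XMLHttpRequest",
        "Referer": "https://example.com/products",
    }
    return Request(f"{API_URL}?{query}", headers=headers)


def parse_products(body):
    """Extract (name, price) pairs from the assumed JSON response shape."""
    payload = json.loads(body)
    return [(item["name"], item["price"]) for item in payload["items"]]


def fetch_products(page):
    """Call the endpoint directly; no browser or rendering step involved."""
    with urlopen(build_request(page), timeout=10) as response:
        return parse_products(response.read())
```

Keeping request construction and response parsing in separate functions makes both easy to check offline, and it isolates the parts most likely to change when the site updates its API.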
