The XHR shortcut
Before scrolling, check the network tab. Infinite scroll is almost always backed by a paginated JSON endpoint that the page fetches as you scroll. The endpoint takes a cursor or page parameter and returns the next batch. Hitting that endpoint directly is far faster than running a browser: no rendering, no scroll loops, just paginated JSON. If the endpoint is open or only requires a CSRF token from the initial page, this is the right answer.
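
As a rough illustration of walking such an endpoint, here is a minimal sketch using only the standard library. The `cursor` query parameter and the `items` and `next_cursor` response fields are assumptions made for the example; the real names are whatever the network tab shows for the site in question. The `fetch` argument is injectable so the paging logic can be exercised without a network.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict, Iterator, Optional


def fetch_json(url: str, params: Dict[str, str],
               headers: Optional[Dict[str, str]] = None) -> Dict[str, Any]:
    """GET url?params and decode the JSON body."""
    query = urllib.parse.urlencode(params)
    full_url = f"{url}?{query}" if query else url
    request = urllib.request.Request(full_url, headers=headers or {})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def iter_feed(url: str,
              fetch: Callable[..., Dict[str, Any]] = fetch_json,
              headers: Optional[Dict[str, str]] = None,
              max_pages: int = 1000) -> Iterator[Any]:
    """Yield items page by page until the endpoint stops returning a cursor."""
    cursor = None
    for _ in range(max_pages):
        # Hypothetical parameter name: the first request carries no cursor.
        params = {} if cursor is None else {"cursor": cursor}
        payload = fetch(url, params, headers)
        # Hypothetical field names: adjust to the real response shape.
        items = payload.get("items", [])
        yield from items
        cursor = payload.get("next_cursor")
        if not cursor or not items:
            break
```

If the endpoint expects a CSRF token, one option is to read it from the initial page and pass it through `headers`; the header name (often something like `X-CSRF-Token`) varies by site and has to be copied from a real request.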
When you have to scroll
If the endpoint is signed, encrypted, or returns rendered HTML fragments, you need a real browser. The loop: record the current scroll height, scroll to the bottom (or by one viewport step), wait for either a new element matching your item selector to appear or the network to go idle, collect the new items, and compare the total item count against the previous iteration. If the count is unchanged for two consecutive iterations, treat the feed as done. Cap the loop at a reasonable maximum (e.g., 200 iterations) to handle Twitter-style feeds that never truly end.
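
The loop can be sketched independently of any particular browser driver. In the version below, `scroll`, `wait`, and `collect` are hypothetical callables supplied by the caller; the function only implements the bookkeeping described above: collect after every step, stop after two iterations with no new items, and cap the total number of iterations.

```python
from typing import Callable, Hashable, Iterable, List, Set


def scroll_until_done(scroll: Callable[[], None],
                      wait: Callable[[], None],
                      collect: Callable[[], Iterable[Hashable]],
                      max_iterations: int = 200,
                      patience: int = 2) -> List[Hashable]:
    """Scroll, wait, collect; stop when nothing new appears `patience` times."""
    seen: Set[Hashable] = set()
    ordered: List[Hashable] = []

    def absorb() -> int:
        """Record items not seen before and return how many were new."""
        added = 0
        for key in collect():
            if key not in seen:
                seen.add(key)
                ordered.append(key)
                added += 1
        return added

    absorb()  # items already rendered before the first scroll
    stale = 0
    for _ in range(max_iterations):
        scroll()
        wait()
        if absorb() == 0:
            stale += 1
            if stale >= patience:
                break  # count unchanged for `patience` iterations in a row
        else:
            stale = 0
    return ordered
```

With Playwright's sync API, for instance, `scroll` might wrap `page.mouse.wheel(0, delta)`, `wait` might wrap `page.wait_for_timeout(ms)`, and `collect` might read a stable attribute from every element matching the item selector. The selector and the attribute are site-specific and are left as assumptions here.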
Pitfalls
Virtualized lists (react-window, react-virtual) remove off-screen items from the DOM as you scroll, so by the time you reach the bottom, the top items are gone. Collect after each scroll step, not at the end, and deduplicate by a stable item key. Some pages defer loading until the user has paused scrolling for a moment; insert a pause of 500 ms to 2 s after each scroll. Anti-bot systems flag mechanical scroll patterns (exact viewport steps, no jitter), so randomize the scroll delta and pause duration.
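
Two of these pitfalls lend themselves to small helpers. The sketch below is one possible shape, not a tuned recipe: `next_scroll_step` picks a randomized delta and pause, and `merge_window` folds the currently rendered rows of a virtualized list into a running dictionary. The 0.6 to 0.95 viewport fractions and the `id` key are assumptions chosen for illustration; keeping each step below one full viewport is a design choice so that consecutive windows overlap and no rows are skipped.

```python
import random
from typing import Any, Dict, Hashable, Iterable, Optional, Tuple


def next_scroll_step(viewport_height: int,
                     rng: Optional[random.Random] = None,
                     min_pause: float = 0.5,
                     max_pause: float = 2.0) -> Tuple[int, float]:
    """Return a jittered (scroll delta in pixels, pause in seconds) pair."""
    if rng is None:
        rng = random.Random()
    # Assumed range: scroll 60-95% of the viewport so windows overlap.
    delta = int(viewport_height * rng.uniform(0.6, 0.95))
    pause = rng.uniform(min_pause, max_pause)
    return delta, pause


def merge_window(collected: Dict[Hashable, Dict[str, Any]],
                 visible: Iterable[Dict[str, Any]],
                 key: str = "id") -> int:
    """Add rows not seen before to `collected`; return how many were new."""
    added = 0
    for item in visible:
        item_key = item[key]
        if item_key not in collected:
            collected[item_key] = item
            added += 1
    return added
```

Calling `merge_window` after every scroll step keeps rows that the list has already unmounted, and its return value doubles as the "anything new?" signal for the stopping rule.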
