The XHR shortcut
Before you bother scrolling, open your browser's network tab (the DevTools panel that lists every request the page makes). Infinite scroll is usually powered by a paginated JSON endpoint that the page calls in the background as you scroll (an XHR or fetch call: a JavaScript request that fetches data without reloading the page). That endpoint takes a cursor or page parameter and returns the next batch of items. Calling it directly is far faster than driving a browser: no rendering, no scroll loops, just JSON you page through. If the endpoint is open, or only needs a CSRF token (a small anti-forgery value) grabbed from the first page, this is the best route.
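As a rough illustration of paging through such an endpoint, here is a minimal sketch using only the Python standard library. The field names `items` and `next_cursor`, the `cursor` query parameter, and the `max_pages` cap are assumptions made for the example; a real endpoint will use whatever names you see in the network tab.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, Optional


def fetch_json(url: str, params: dict, headers: dict) -> dict:
    """GET url with the given query params and return the decoded JSON body."""
    query = urllib.parse.urlencode(params)
    full_url = f"{url}?{query}" if query else url
    request = urllib.request.Request(full_url, headers=headers)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def iter_feed(
    url: str,
    headers: Optional[dict] = None,
    fetch: Callable[[str, dict, dict], dict] = fetch_json,
    max_pages: int = 1000,
) -> Iterator[dict]:
    """Yield items from a cursor-paginated endpoint until the cursor runs out.

    Assumes (hypothetically) a response shaped like
    {"items": [...], "next_cursor": "..." or null}.
    """
    cursor = None
    for _ in range(max_pages):  # safety cap in case the cursor never ends
        params = {} if cursor is None else {"cursor": cursor}
        payload = fetch(url, params, headers or {})
        yield from payload.get("items", [])
        cursor = payload.get("next_cursor")
        if not cursor:
            break
```

Calling `list(iter_feed("https://example.com/api/feed", headers={"X-CSRF-Token": token}))` would then walk every page; both the URL and the header name are placeholders. The `fetch` parameter exists so the paging logic can be tried against canned responses before pointing it at a live site.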
When you have to scroll
If that endpoint is signed, encrypted, or hands back ready-made HTML fragments instead of clean data, you need a real browser. The loop goes: record the current scroll height and item count, scroll to the bottom (or down one screen at a time), wait until either a new item appears or the network goes quiet, collect the new items, then compare the item count against the previous round. If it hasn't changed for two rounds in a row, treat the feed as done. Cap the loop at a sensible maximum (for example 200 rounds) so Twitter-style feeds that never truly end don't run forever.
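The loop above can be sketched independently of any particular browser driver. In this minimal version the three callables (`scroll_step`, `wait_for_load`, `visible_keys`) are hypothetical hooks you would implement with your automation library of choice; only the stopping logic (two unchanged rounds, a 200-round cap) comes from the description above.

```python
from typing import Callable, Hashable, Iterable, List


def scroll_until_done(
    scroll_step: Callable[[], None],
    wait_for_load: Callable[[], None],
    visible_keys: Callable[[], Iterable[Hashable]],
    max_rounds: int = 200,
    stall_limit: int = 2,
) -> List[Hashable]:
    """Scroll until no new items appear for stall_limit rounds in a row.

    Returns the unique item keys in the order they were first seen.
    """
    seen = {}  # dict keeps insertion order, so it doubles as an ordered set
    for key in visible_keys():  # items already on the page before scrolling
        seen.setdefault(key, None)

    stalled = 0
    for _ in range(max_rounds):  # hard cap for feeds that never end
        scroll_step()
        wait_for_load()
        before = len(seen)
        for key in visible_keys():
            seen.setdefault(key, None)
        if len(seen) == before:
            stalled += 1
            if stalled >= stall_limit:
                break  # nothing new for stall_limit rounds: assume the end
        else:
            stalled = 0
    return list(seen)
```

With a browser automation library such as Playwright or Selenium, `scroll_step` would wrap a scroll call, `wait_for_load` a wait on a selector or on network idle, and `visible_keys` a query that returns a stable identifier (an item ID or URL) for each item currently in the page.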
Pitfalls
Virtualized lists (libraries like react-window or react-virtual) remove off-screen items from the DOM (the page's live HTML) as you scroll, so by the time you reach the bottom, the top items are already gone. The fix is to collect after each scroll step, not just at the end. Some pages also wait until the user has paused for a moment before loading more, so add a pause of 500 ms to 2 s after each scroll. Finally, anti-bot systems flag mechanical scrolling (identical screen-sized jumps with no variation), so randomize how far you scroll and how long you pause.
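Two of these fixes are easy to show in isolation: merging whatever is currently rendered into a running collection, and picking a randomized scroll distance and pause. The sketch below is illustrative only. The `id` key and the 60% to 100% of a screen range are assumptions for the example; the pause range of 0.5 to 2 seconds follows the figure given above.

```python
import random
from typing import Dict, List, Tuple


def merge_visible(seen: Dict, visible_items: List[dict], key: str = "id") -> int:
    """Add currently rendered items to seen, skipping ones already collected.

    Returns how many items were new. Call this after every scroll step so
    items that a virtualized list later removes are not lost.
    """
    added = 0
    for item in visible_items:
        if item[key] not in seen:
            seen[item[key]] = item
            added += 1
    return added


def next_scroll(viewport_height: int, rng: random.Random) -> Tuple[int, float]:
    """Pick a randomized scroll distance (pixels) and pause (seconds)."""
    # Assumed range: between 60% and 100% of one screen per step.
    distance = int(viewport_height * rng.uniform(0.6, 1.0))
    # Pause between 0.5 s and 2 s so pages that wait for idle time can load.
    pause = rng.uniform(0.5, 2.0)
    return distance, pause


# Simulated virtualized list: only a sliding window of items is rendered.
seen: Dict = {}
merge_visible(seen, [{"id": 1}, {"id": 2}, {"id": 3}])  # first screen
merge_visible(seen, [{"id": 3}, {"id": 4}, {"id": 5}])  # items 1 and 2 are gone
collected_ids = list(seen)  # all five IDs survive, in first-seen order
```

Passing a `random.Random` instance rather than using the module-level functions keeps runs reproducible when you seed it during testing.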
