Per-request vs. per-session rotation
The first decision is how often you change IPs, and it follows the workflow, not a preference. Per-request rotation picks a fresh proxy for every HTTP call. It's the right default for stateless scraping, where each URL stands alone and nothing carries over between requests (a product catalog, a list of independent pages). Spreading every call across the pool keeps any one IP well under the per-IP rate limit.
Per-session (sticky) rotation pins one IP to a sequence of requests. You need this whenever state carries forward: a login, a cart, paginated results behind a cookie, or anything where the server expects the same client across steps. A real logged-in user stays on one IP for the whole flow, so a sticky session keeps the cart, cookies, and session state consistent from one step to the next. With a list of your own proxies you implement stickiness by keeping the same proxy for a whole worker's run; with a rotating-residential gateway you append a session token to the username (commonly user-session-abc123) and the gateway holds that exit IP for a window like 1, 10, or 30 minutes.
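As a rough sketch of the gateway-style sticky session described above, the helper below builds a proxy URL whose username carries a session token. It assumes the user-session-<token> convention mentioned in the text; the exact syntax varies by provider, and the host gateway.example.com, port 8000, and the function name sticky_proxy_url are placeholders chosen for illustration.

```python
import uuid
from urllib.parse import quote


def sticky_proxy_url(user, password, host, port, session_id=None):
    """Build an HTTP proxy URL whose username pins a sticky session.

    Assumes the gateway reads the session token from a
    "<user>-session-<token>" username; check your provider's docs.
    """
    # Generate a short random token when the caller does not supply one.
    session_id = session_id or uuid.uuid4().hex[:8]
    username = f"{user}-session-{session_id}"
    # Percent-encode credentials so characters like "@" or ":" stay valid.
    return f"http://{quote(username, safe='')}:{quote(password, safe='')}@{host}:{port}"


# Hypothetical gateway host and port, for illustration only.
proxy = sticky_proxy_url("user", "secret", "gateway.example.com", 8000, "abc123")
proxies = {"http": proxy, "https": proxy}
```

Reusing the same session_id for every step of a login or checkout flow keeps those requests on one exit IP for as long as the gateway's window lasts; generating a new token starts a new sticky session.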
A subtle trap with requests.Session: HTTP keep-alive pools a TCP connection, so once a session opens a socket to a rotating gateway, later requests can ride the same tunnel and hit the same exit node even though you configured per-request rotation. If you want a truly new IP per call against a gateway, send each request without reusing a keep-alive connection, or use one short-lived session per IP.
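One way to sidestep the keep-alive trap, sketched under the assumption that the gateway assigns an exit IP per new connection: build a throwaway requests.Session for each call and ask for the connection to be closed afterwards. The helper names and the gateway URL are hypothetical, and whether this yields a new IP every time depends on how the gateway behaves.

```python
import requests


def make_single_use_session(gateway_proxy):
    """Return a Session meant to be used for one request and then closed."""
    session = requests.Session()
    # Same proxy for both keys; the scheme is the proxy's own protocol.
    session.proxies.update({"http": gateway_proxy, "https": gateway_proxy})
    # Ask for the socket to be closed after the response instead of kept alive.
    session.headers["Connection"] = "close"
    return session


def fetch_once(url, gateway_proxy, timeout=10):
    """Fetch one URL on a fresh connection pool, then discard the pool."""
    # Leaving the "with" block closes the session and its pooled sockets,
    # so the next call has to open a new connection to the gateway.
    with make_single_use_session(gateway_proxy) as session:
        return session.get(url, timeout=timeout)
```

The cost is a new TCP and TLS handshake per request, so this trades throughput for a better chance of a different exit IP on each call.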
Building a pool and rotating with requests or curl_cffi
The minimal pattern is a list of proxy URLs plus a chooser. Round-robin (cycle the list in order) is predictable and easy to reason about; random selection avoids any IP-ordering pattern. For most jobs random is the safer default. Each proxy is passed as a proxies dict with http and https keys; the scheme on the value is the proxy's own protocol, so for an HTTP proxy both values start with http:// even for HTTPS targets.
- requests: requests.get(url, proxies={"http": p, "https": p}, timeout=10). Pure Python, ubiquitous, but it sends a default Python TLS fingerprint, which differs from a browser's handshake.
- curl_cffi: a drop-in-style client built on curl-impersonate that reproduces a real browser's TLS/JA3 handshake via impersonate="chrome", useful when a server expects a browser-style handshake. Same proxies dict shape. A very common mistake is putting https:// on the value of the https key; keep it http:// for an HTTP proxy.
- httpx / aiohttp: for async pools, where you fan out many requests across the proxy list concurrently.
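The list-plus-chooser pattern can be sketched with the standard library alone. The proxy addresses below are placeholders from a documentation IP range, and the helper names (as_proxies_dict, round_robin, pick_random) are invented for this example.

```python
import itertools
import random

# Placeholder HTTP proxies; substitute your own pool.
PROXIES = [
    "http://user:pass@203.0.113.10:8080",
    "http://user:pass@203.0.113.11:8080",
    "http://user:pass@203.0.113.12:8080",
]


def as_proxies_dict(proxy_url):
    """Shape one proxy URL the way requests and curl_cffi expect it."""
    # Both keys carry the proxy's own scheme (http:// for an HTTP proxy),
    # even when the target URL is HTTPS.
    return {"http": proxy_url, "https": proxy_url}


def round_robin(pool):
    """Yield proxies in list order, wrapping around forever."""
    return itertools.cycle(pool)


def pick_random(pool):
    """Pick one proxy uniformly at random, avoiding an ordering pattern."""
    return random.choice(pool)


rotation = round_robin(PROXIES)
first_four = [next(rotation) for _ in range(4)]  # wraps back to the first proxy
proxies = as_proxies_dict(pick_random(PROXIES))
```

With requests, a per-request rotation call would then look like requests.get(url, proxies=as_proxies_dict(pick_random(PROXIES)), timeout=10).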
Rotating IPs reduces per-IP rate-limit hits (a 429 means you sent too many requests from one address) and IP-reputation blocks (a 403 once an IP looks suspicious), because the next request comes from a clean IP. It only affects per-IP limits and IP reputation, not the TLS handshake or request patterns; for sites that expect a browser-style handshake, a client like curl_cffi sends one.
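To illustrate how rotation responds to a 429 or 403, here is a small retry loop that moves to a different proxy when one of those statuses comes back. The fetch argument is a hypothetical callable taking (url, proxy) and returning (status_code, body); in practice it could wrap requests.get or a curl_cffi call. The function name and the limit of three attempts are assumptions.

```python
import random

# Statuses the text associates with per-IP rate limits and IP reputation.
BLOCK_STATUSES = {403, 429}


def fetch_with_rotation(url, proxies, fetch, max_attempts=3):
    """Try up to max_attempts distinct proxies, stopping at the first
    response that is not a 403 or 429.

    Returns (status, body, tried), where tried lists (proxy, status) pairs.
    """
    if not proxies:
        raise ValueError("proxy list is empty")
    # Sample without replacement so a blocked IP is not retried immediately.
    candidates = random.sample(proxies, k=min(max_attempts, len(proxies)))
    tried = []
    for proxy in candidates:
        status, body = fetch(url, proxy)
        tried.append((proxy, status))
        if status not in BLOCK_STATUSES:
            break
    return status, body, tried
```

This only helps with blocks tied to the IP itself; if every proxy returns 403, the cause is more likely the handshake or the request pattern than the address.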
Health-checking and retiring dead proxies
A static list rots fast: proxies time out, get blocked, or return garbage. A production rotator treats the pool as live state. Before a run, probe each proxy against a cheap endpoint (an IP-echo service such as https://httpbin.org/ip or https://api.ipify.org) and keep only the ones that answer quickly with a valid response. During the run, score outcomes: a connection timeout, a 407 (proxy auth failed), repeated 403s, or a sudden 429 are signals to pull that IP out of rotation. Don't delete it permanently for a single 429. Put it on a cooldown timer and re-admit it later, since rate limits are temporary by design.
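The in-run scoring rules above can be sketched as a small pool object. This is one possible shape, not a reference implementation: the class name ProxyPool, the 300-second cooldown, and the threshold of three 403s are assumptions, and treating a timeout or 407 as retirement for the rest of the run is a simplification. The pre-run probe against an IP-echo endpoint is left out.

```python
import random
import time


class ProxyPool:
    """Track which proxies are usable right now."""

    def __init__(self, proxies, cooldown_seconds=300, max_403=3, clock=time.monotonic):
        self._clock = clock
        self._cooldown = cooldown_seconds
        self._max_403 = max_403
        # proxy -> earliest time it may be used again (-inf means "now").
        self._ready_at = {p: float("-inf") for p in proxies}
        self._forbidden = {p: 0 for p in proxies}  # count of 403s seen per proxy

    def pick(self):
        """Return a random proxy that is neither retired nor cooling down."""
        now = self._clock()
        ready = [p for p, t in self._ready_at.items() if t <= now]
        if not ready:
            raise RuntimeError("no healthy proxies available")
        return random.choice(ready)

    def _retire(self, proxy):
        del self._ready_at[proxy]
        del self._forbidden[proxy]

    def report(self, proxy, status=None, timed_out=False):
        """Record the outcome of one request made through proxy."""
        if proxy not in self._ready_at:
            return  # already retired
        if timed_out or status == 407:
            self._retire(proxy)  # dead proxy or bad credentials
        elif status == 403:
            self._forbidden[proxy] += 1
            if self._forbidden[proxy] >= self._max_403:
                self._retire(proxy)  # repeated 403s: the IP looks burned
        elif status == 429:
            # Rate limits are temporary: park the IP, then re-admit it.
            self._ready_at[proxy] = self._clock() + self._cooldown
```

The clock is injectable so the cooldown logic can be exercised without waiting; in normal use the default time.monotonic is enough.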
Practical rules that mirror good rotating-proxy hygiene: keep rotation inside one geography so a session doesn't jump countries mid-flow; size the pool so the request rate per IP stays well within a polite, sustainable pace for the target; and log which IPs succeed against which targets so you can tune instead of guessing. Once you're past roughly 50,000-100,000 requests a day, maintaining and health-checking your own pool starts costing more engineering time than it saves. At that point a single rotating-residential endpoint that picks and retires exit IPs for you is usually the cleaner move, and a managed web-data API goes further by handling proxy rotation, browser rendering, and retries behind one request so your code just asks for the page.
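The pool-sizing rule reduces to simple arithmetic: divide the overall request rate by the pace you are willing to send from a single IP, and round up. The sketch below assumes traffic is spread evenly over the day and over the pool; the per-IP pace of 6 requests per minute is an illustrative figure, not a recommendation for any particular target.

```python
import math

MINUTES_PER_DAY = 24 * 60


def min_pool_size(requests_per_day, max_per_ip_per_minute):
    """Smallest pool that keeps each IP at or under the chosen pace,
    assuming requests are spread evenly across time and proxies."""
    overall_per_minute = requests_per_day / MINUTES_PER_DAY
    return math.ceil(overall_per_minute / max_per_ip_per_minute)


# 100,000 requests a day is about 69.4 per minute overall.
# At an assumed 6 requests per minute per IP, that needs 12 IPs.
pool_size = min_pool_size(100_000, 6)
```

Real traffic is bursty and some proxies will be cooling down at any moment, so a working pool generally needs headroom above this floor.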
