Proxy Archive: Historical Snapshots

The ProxySpace.pro archive preserves a daily snapshot of every list we publish. The live http.txt is updated every 20 minutes; the archive keeps a frozen end-of-day copy of each list, so you can rewind to any retained date for replay scans, longitudinal studies, or rebuilding a target list from a known-good moment in time.

What is archived

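Each daily snapshot covers every list we publish (http, https, socks4, and socks5), stored as plain-text files such as archive_socks5.txt. Snapshots are also exported in the alternative .space format (archive_socks49.space, archive_socks59.space) for clients that prefer line-separated, CIDR-aware lists.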

Use cases

  1. Replay scrapes: rerun a research project against exactly the same set of exits you used last month. Reproducible scrape pipelines need reproducible proxy pools.
  2. Longitudinal proxy churn studies: measure how long the average free proxy survives, which countries dominate the live pool, and which ASNs leak the most open relays.
  3. Incident response: if your service was hit through a public proxy, find out which list contained that IP on the day of the event.
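As a rough sketch of the churn and incident-response cases, the snippet below compares two snapshots and looks up which snapshots contain a given IP. It works on snapshot text that has already been loaded; the function names (pairs, churn, lists_containing) and the labels used as dictionary keys are illustrative assumptions, not part of the archive itself.

```python
import re

# Match host:port only at the start of a line, so '#' header lines are skipped.
PAIR = re.compile(r"^(\d+\.\d+\.\d+\.\d+):(\d+)", re.MULTILINE)


def pairs(text):
    """Return the set of (host, port) tuples found in one snapshot's text."""
    return {(host, int(port)) for host, port in PAIR.findall(text)}


def churn(old_text, new_text):
    """Compare two snapshots: which proxies survived, dropped out, or appeared."""
    old, new = pairs(old_text), pairs(new_text)
    return {"survived": old & new, "dropped": old - new, "added": new - old}


def lists_containing(ip, snapshots):
    """Return the labels of all snapshots that list the given IP.

    snapshots is a dict mapping a label of your choice (hypothetical here,
    e.g. a date plus protocol) to that snapshot's text.
    """
    return sorted(
        label
        for label, text in snapshots.items()
        if any(host == ip for host, _ in pairs(text))
    )
```

For a replay scrape, the "survived" set between last month's snapshot and today's is one possible starting point when you want exits that are both historically consistent and still likely to be alive.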

Format

Each line is a single host:port pair, optionally followed by a tab-separated comment field. Lines starting with # are header comments containing the snapshot date and the protocol bucket. Parsing it is one regex away in any language:

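# Python
import re
# Anchor at line start so '#' header lines and trailing comment fields are ignored
with open('archive_socks5.txt') as f:
    addrs = re.findall(r'^(\d+\.\d+\.\d+\.\d+):(\d+)', f.read(), re.MULTILINE)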
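If you also want to keep the header comments and the optional tab-separated comment field, a line-by-line parser may be easier to extend than a single regex. The following is a minimal sketch under the format described above; the Entry class and parse_archive function are hypothetical names, and the exact wording of the header lines is not assumed, so they are returned as raw strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    host: str
    port: int
    comment: str = ""


def parse_archive(text):
    """Split snapshot text into (header_lines, entries)."""
    headers, entries = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("#"):
            # Header comment: keep the text after '#' without interpreting it.
            headers.append(line[1:].strip())
            continue
        # The comment field, if present, follows the first tab.
        addr, _, comment = line.partition("\t")
        host, _, port = addr.strip().rpartition(":")
        if not host or not port.isdigit():
            continue  # not a host:port line; skip rather than guess
        entries.append(Entry(host, int(port), comment.strip()))
    return headers, entries
```

Usage would look something like headers, entries = parse_archive(open('archive_socks5.txt').read()), after which entries can be filtered by port or comment as needed.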

Retention

The archive currently holds rolling snapshots from the last several months. Older data can be requested directly via the contact channel. We rotate when the file grows past a sensible size; we don't expire entries on a fixed schedule.

For the always-fresh live lists, jump back to the about page or pull /http.txt, /https.txt, /socks4.txt, /socks5.txt directly.