Proxy Archive: Historical Snapshots
The ProxySpace.pro archive preserves a daily snapshot of every list we publish.
Today's live http.txt is updated every 20 minutes; the archive
keeps a frozen copy at the end of each day so you can rewind to any past date
for replay scans, longitudinal studies, or rebuilding a target list from a
known-good moment in time.
What is archived
- archive_http.txt: rolling archive of HTTP proxies, one snapshot per day.
- archive_https.txt: HTTPS (CONNECT-capable) proxies.
- archive_socks4.txt: SOCKS4 proxies.
- archive_socks5.txt: SOCKS5 proxies.
Snapshots are also exported in the alternative .space format
(archive_socks49.space,
archive_socks59.space) for
clients that prefer line-separated CIDR-aware lists.
Use cases
- Replay scrapes — rerun a research project against exactly the same set of exits you used last month. Reproducible scrape pipelines need reproducible proxy pools.
- Longitudinal proxy churn studies — measure how long the average free proxy survives, what countries dominate the live pool, and which ASNs leak the most open relays.
- Incident response — if your service was hit through a public proxy, find out which list contained that IP on the day of the event.
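The incident-response and churn use cases both come down to splitting an archive file into its daily snapshots and comparing sets of entries. The sketch below is one way to do that, not the archive's own tooling. It assumes each snapshot opens with one or more # header lines, followed by that day's host:port entries. The helper names (group_snapshots, find_ip, churn) and the header text in SAMPLE are hypothetical; the header is kept as an opaque label and never parsed as a date, because its exact layout is not documented here.

```python
def group_snapshots(lines):
    """Group host:port entries under the '#' header text that precedes them.

    Assumption: a run of '#' lines starts a new snapshot. The header text is
    used as an opaque label and is not parsed.
    """
    snapshots = {}       # label -> set of "host:port" strings, in file order
    header_parts = []    # header lines collected for the upcoming snapshot
    current = None       # label of the snapshot currently being filled
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            if current is not None:
                # First header after a block of entries: begin a new snapshot.
                header_parts = []
                current = None
            header_parts.append(line.lstrip("#").strip())
            continue
        if current is None:
            current = " | ".join(header_parts) or "(no header)"
            snapshots.setdefault(current, set())
        # Drop the optional tab-separated comment field.
        snapshots[current].add(line.split("\t", 1)[0].strip())
    return snapshots


def find_ip(snapshots, ip):
    """Return the labels of every snapshot that lists the given IP on any port."""
    return [
        label
        for label, entries in snapshots.items()
        if any(entry.rsplit(":", 1)[0] == ip for entry in entries)
    ]


def churn(older, newer):
    """Return (dropped, added) entries between two snapshot sets."""
    return older - newer, newer - older


# Hypothetical sample data; real header lines may look different.
SAMPLE = [
    "# 2024-01-01 socks5",
    "192.0.2.10:1080\tfirst seen",
    "192.0.2.11:1080",
    "# 2024-01-02 socks5",
    "192.0.2.11:1080",
    "192.0.2.12:1080",
]
```

To use it on a real file, pass the open file object to group_snapshots, call find_ip with the address from your logs, and read off the returned header labels to see which days listed it. Comparing consecutive snapshots with churn gives the raw numbers for a survival study.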
Format
Each line is a single host:port pair, optionally followed by
a tab-separated comment field. Lines starting with # are header
comments containing the snapshot date and the protocol bucket. Parsing it is
one regex away in any language:
# Python
import re

# Match host:port at the start of each line, so "#" header lines are skipped.
with open('archive_socks5.txt') as f:
    addrs = re.findall(r'^(\d+\.\d+\.\d+\.\d+):(\d+)', f.read(), re.M)
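When the comment field matters, or when malformed lines should be rejected rather than silently matched, a line-by-line parser may be a better fit than a single regex. The following is a minimal sketch based only on the format described above. ProxyEntry and parse_snapshot are hypothetical names, and it assumes hosts are IPv4 literals, as the regex above does.

```python
import ipaddress
from typing import Iterable, Iterator, NamedTuple, Optional


class ProxyEntry(NamedTuple):
    host: str
    port: int
    comment: Optional[str]  # tab-separated comment field, if present


def parse_snapshot(lines: Iterable[str]) -> Iterator[ProxyEntry]:
    """Yield one ProxyEntry per valid host:port line, skipping everything else."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Skip blank lines and '#' header comments.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        # Everything after the first tab is the optional comment field.
        address, _, comment = line.partition("\t")
        host, sep, port_text = address.strip().rpartition(":")
        if not sep or not port_text.isdecimal():
            continue
        port = int(port_text)
        if not 1 <= port <= 65535:
            continue
        try:
            # Assumption: hosts are IPv4 literals; anything else is dropped.
            ipaddress.IPv4Address(host)
        except ValueError:
            continue
        yield ProxyEntry(host, port, comment.strip() or None)
```

Because parse_snapshot accepts any iterable of lines, an open file object can be passed directly, for example `with open('archive_socks5.txt') as f: entries = list(parse_snapshot(f))`, without loading the whole archive into one string first.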
Retention
The archive currently holds rolling snapshots from the last several months. Older data can be requested directly via the contact channel. We rotate when the file grows past a sensible size; we don't expire entries on a fixed schedule.
For the always-fresh live lists, jump back to the about page or pull /http.txt, /https.txt, /socks4.txt, /socks5.txt directly.