A non-blocking HTTP proxy server with disk-based response caching, written in Python using only the standard library.
Requests are made by visiting `http://localhost:<port>/<target-host>/<path>` in your browser. The proxy forwards the request to the target host, stores the response on disk, and serves it from cache on subsequent requests until the cache TTL expires.
- Non-blocking I/O via `select()` — handles many concurrent browser requests
- Disk-based response caching with configurable TTL
- Configurable listen port, upstream port, and bind address
- Structured log output (INFO by default, DEBUG with `--verbose`)
- Graceful shutdown on Ctrl-C
- Zero external dependencies — pure Python 3 standard library
- Python 3.7 or later
- No third-party packages needed
The Folder/ directory contains a test page with 272 images — great for observing cache behavior.
```shell
cd Folder
python -m http.server 8000
```

Leave this running in its own terminal.
```shell
# Cache responses for 60 seconds, connect to upstream on port 8000
python proxy.py 60 --upstream-port 8000
```

Then open http://localhost:8888/localhost/ in your browser.
The proxy fetches localhost:8000/ and logs each request. Reload within 60 seconds to see cache hits. After the TTL expires, the proxy re-fetches from the upstream.
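The TTL check described above can be sketched as follows. This is an illustrative assumption — `is_fresh` and the file-modification-time heuristic are hypothetical names, not necessarily how proxy.py implements it:

```python
import os
import time

def is_fresh(cache_path, stale_time):
    """Return True when a cached response exists and is younger than the TTL.

    A stale_time of 0 disables caching, so every request is fetched fresh.
    """
    if stale_time <= 0 or not os.path.exists(cache_path):
        return False
    # Age of the cached file = now minus its last modification time.
    age = time.time() - os.path.getmtime(cache_path)
    return age < stale_time
```

A cache hit then simply means `is_fresh(path, ttl)` returned True for the request's cache file.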
```shell
python proxy.py <stale_time> [options]
```
| Argument | Description |
|---|---|
| `stale_time` | Cache TTL in seconds. Use `0` to disable caching (always fetch fresh). |
| Flag | Default | Description |
|---|---|---|
| `--port` | 8888 | Port the proxy listens on |
| `--upstream-port` | 80 | Port used when connecting to upstream servers |
| `--host` | 127.0.0.1 | Address to bind to. Use `0.0.0.0` to accept connections from other machines on your network. |
| `--cache-dir` | `./cache` | Directory where cached responses are stored |
| `--verbose`, `-v` | off | Enable debug-level logging |
```shell
# 5-minute cache on default port 8888
python proxy.py 300

# 1-hour cache on a custom port
python proxy.py 3600 --port 9090

# No caching — always fetch from upstream
python proxy.py 0

# Test site running on port 8000 instead of 80
python proxy.py 120 --upstream-port 8000

# Store cache in a specific directory
python proxy.py 300 --cache-dir /tmp/proxycache

# Debug logging
python proxy.py 60 --verbose
```

This proxy uses an embedded-host URL scheme rather than the standard proxy protocol. You visit URLs in the form:
```
http://localhost:<port>/<target-host>/<path>
```
| Example URL | What it fetches |
|---|---|
| `http://localhost:8888/example.com/index.html` | `example.com/index.html` on port 80 |
| `http://localhost:8888/localhost/` | `localhost/` on the configured upstream port |
| `http://localhost:8888/myserver.local/api/data` | `myserver.local/api/data` on port 80 |
Note: This proxy only supports plain HTTP (port 80). HTTPS targets are not supported.
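The embedded-host scheme amounts to splitting the first path segment off the request path. A minimal sketch, assuming every upstream connection uses the configured upstream port (which defaults to 80, matching the table above) — `split_target` is a hypothetical helper, not necessarily proxy.py's actual parsing:

```python
def split_target(request_path, upstream_port=80):
    """Split '/<target-host>/<path>' into (host, port, path).

    Assumes all upstream connections use the configured upstream port,
    which defaults to 80 (plain HTTP only).
    """
    # First segment is the target host; the remainder is the path.
    host, _, rest = request_path.lstrip("/").partition("/")
    return host, upstream_port, "/" + rest
```

For example, `split_target("/example.com/index.html")` yields `("example.com", 80, "/index.html")`.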
simple_proxy.py is a minimal blocking proxy that handles one request at a time with no caching. It is useful as a reference implementation or for simple debugging.
```shell
python simple_proxy.py
python simple_proxy.py --port 9999
python simple_proxy.py --upstream-port 8000
```

The Folder/ directory contains a self-hosted test website:
- `index.html` — a grid of 272 images (JPG + GIF)
- `images/` — image assets served as binary content
Its purpose is to generate many parallel HTTP GET requests so you can observe cache misses on first load vs. cache hits on reload.
Setup:
```shell
# Terminal 1 — test website
cd Folder
python -m http.server 8000

# Terminal 2 — proxy with 30-second TTL
python proxy.py 30 --upstream-port 8000
```

Open http://localhost:8888/localhost/ and watch the proxy log. Reload within 30 seconds — all 272 images should be served from cache instantly. Wait 30 seconds and reload again to see them fetched fresh.
```
Browser ──GET /localhost/index.html──> Proxy (port 8888)
                                          │
                              cache miss? fetch from upstream
                                          │
                                          ▼
                               Upstream Server (port 8000)
                                          │
                               store response to ./cache/
                                          │
Browser <──────────── HTTP response ──────┘
```
The proxy uses a single-threaded event loop built on `select()`:
- Accept — new browser connections are registered for reading
- Parse — once a full HTTP request arrives (`\r\n\r\n`), check the cache
- Cache hit — serve the stored response directly from disk
- Cache miss — open a non-blocking connection to the upstream, send the GET request, accumulate the full response, store it to disk, then forward to the browser
- Partial sends — large responses that can't be sent in one call are queued and continued when the socket becomes writable again
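The read/write halves of that loop, including the partial-send continuation, can be sketched roughly like this. `pump_once` is an illustrative name, and the accept and request-parsing steps are elided — this is not proxy.py's exact structure:

```python
import select
import socket

def pump_once(inputs, outputs, send_buffers):
    """One event-loop iteration: read what's readable, flush what's
    writable, continuing any partially sent buffers."""
    readable, writable, _ = select.select(inputs, outputs, [], 0)
    received = {}
    for sock in readable:
        # Accumulate whatever arrived; a real loop would check for \r\n\r\n.
        received[sock] = sock.recv(4096)
    for sock in writable:
        buf = send_buffers.get(sock, b"")
        sent = sock.send(buf)           # may send only part of the buffer
        send_buffers[sock] = buf[sent:] # keep the unsent tail for next time
        if not send_buffers[sock]:
            outputs.remove(sock)        # nothing left to write
    return received
```

Running this repeatedly lets one thread interleave many connections: sockets with pending output stay in `outputs` until their buffers drain.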
To rename the repo on GitHub: Settings → General → Repository name → set it to httpcache-proxy (or your preferred name), then update your local remote:
```shell
git remote set-url origin https://github.com/<your-username>/httpcache-proxy.git
```

MIT — do whatever you like with it.