A structured learning path to deeply understand this production-grade HTTP server built from scratch.
This guide is designed for developers who want to understand how web servers work at a fundamental level. Follow this reading order to build knowledge progressively from foundational concepts to advanced patterns.
- Read in order - Each section builds on the previous
- Run the code - After each section, experiment with the components
- Check the ASCII diagrams - They visualize complex flows
- Review interview Q&A - Each file has relevant interview questions
- Take notes - Jot down concepts that are new to you
Understand what you're building and the networking basics
┌─────────────────────────────────────────────────────────────────────────┐
│ START HERE │
│ ────────── │
│ │
│ 1. main.py (root) │
│ └── THE MOST IMPORTANT FILE │
│ └── Contains ~550 lines of networking fundamentals │
│ └── OSI model, TCP/IP, sockets, 3-way handshake │
│ └── Read this like a textbook before anything else │
│ │
│ 2. src/httpserver/__init__.py │
│ └── Project overview and architecture diagram │
│ └── Package structure explanation │
│ └── Quick start example │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Key Concepts You'll Learn:
- OSI 7-layer model and TCP/IP 4-layer model
- How TCP works (reliable, ordered, connection-oriented)
- Socket programming fundamentals
- The 3-way handshake (SYN → SYN-ACK → ACK)
- Why we use sockets for network programming
Time Estimate: 30-45 minutes
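To make the socket and handshake concepts concrete before opening main.py, here is a minimal loopback sketch. It is not taken from the project; the function names `serve_once` and `echo_roundtrip` are made up for illustration. The kernel performs the SYN → SYN-ACK → ACK exchange inside `connect()` on the client side, and `accept()` hands back a connection whose handshake has already completed.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one connection and reply with the upper-cased payload."""
    # accept() returns a connection whose 3-way handshake is already done.
    conn, _addr = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def echo_roundtrip(message: bytes) -> bytes:
    """Start a one-shot server on loopback, send a message, return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]

        server = threading.Thread(target=serve_once, args=(listener,))
        server.start()

        # create_connection() calls connect(), which triggers the handshake.
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(message)
            reply = client.recv(1024)

        server.join()
    return reply


if __name__ == "__main__":
    print(echo_roundtrip(b"hello"))  # b'HELLO'
```

A single `recv(1024)` is enough here only because the message is tiny and travels over loopback. TCP is a byte stream, so a real server has to loop until it has a complete message.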
How raw TCP connections are handled
┌─────────────────────────────────────────────────────────────────────────┐
│ CORE MODULE - src/httpserver/core/ │
│ ──────────────────────────────────── │
│ │
│ Read in this order: │
│ │
│ 3. core/__init__.py │
│ └── Module overview │
│ │
│ 4. core/socket_server.py │
│ └── TCP socket lifecycle │
│ └── socket(), bind(), listen(), accept() explained │
│ └── Accept loop implementation │
│ │
│ 5. core/connection.py │
│ └── Connection state machine │
│ └── Reading/writing bytes from socket │
│ └── Buffer management │
│ │
│ 6. core/thread_pool.py │
│ └── Why we need concurrency │
│ └── Thread pool pattern │
│ └── Task queue and worker threads │
│ └── Graceful shutdown │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Key Concepts You'll Learn:
- Socket lifecycle (create → bind → listen → accept → close)
- Connection state management
- Thread pool concurrency pattern
- Why threading was chosen over async for this use case
- Graceful shutdown handling
Time Estimate: 45-60 minutes
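The thread pool pattern listed above can be sketched in a few lines. This is an illustrative version, not the code in core/thread_pool.py: the class name, method names, and the use of `None` as a shutdown sentinel are all assumptions. A fixed number of workers pull callables from a shared queue, and shutdown waits for queued work to finish before the workers exit.

```python
import logging
import queue
import threading
from typing import Callable, List, Optional

Task = Callable[[], None]


class ThreadPool:
    """Fixed-size pool: workers block on a shared FIFO task queue."""

    def __init__(self, num_workers: int = 4) -> None:
        self._tasks: "queue.Queue[Optional[Task]]" = queue.Queue()
        self._workers: List[threading.Thread] = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(num_workers)
        ]
        for worker in self._workers:
            worker.start()

    def _run(self) -> None:
        while True:
            task = self._tasks.get()
            if task is None:  # sentinel: this worker should exit
                break
            try:
                task()
            except Exception:
                # One failing task must not kill the worker thread.
                logging.exception("task failed")

    def submit(self, task: Task) -> None:
        self._tasks.put(task)

    def shutdown(self) -> None:
        """Graceful shutdown: queued tasks run first, then workers exit."""
        for _ in self._workers:
            self._tasks.put(None)  # one sentinel per worker
        for worker in self._workers:
            worker.join()
```

Because the queue is FIFO, the sentinels sit behind every task submitted before `shutdown()`, so no accepted work is dropped. Capping the worker count is the point of the pattern: an unbounded thread-per-connection design lets a burst of clients exhaust memory.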
How HTTP messages are parsed and built
┌─────────────────────────────────────────────────────────────────────────┐
│ HTTP MODULE - src/httpserver/http/ │
│ ────────────────────────────────── │
│ │
│ Read in this order: │
│ │
│ 7. http/__init__.py │
│ └── HTTP request-response cycle overview │
│ │
│ 8. http/status_codes.py │
│ └── HTTP status code categories (1xx-5xx) │
│ └── When to use each status code │
│ │
│ 9. http/request.py ⭐ CRITICAL FILE │
│ └── HTTP request anatomy │
│ └── Request line parsing (method, path, version) │
│ └── Header parsing │
│ └── Body handling (Content-Length, chunked) │
│ └── Query string parsing │
│ └── Security: path traversal, request smuggling │
│ │
│ 10. http/response.py ⭐ CRITICAL FILE │
│ └── HTTP response structure │
│ └── ResponseBuilder pattern │
│ └── JSON, HTML, file responses │
│ └── Redirect handling │
│ │
│ 11. http/mime_types.py │
│ └── Content-Type detection │
│ └── Why MIME types matter │
│ │
│ 12. http/router.py │
│ └── URL routing architecture │
│ └── Pattern matching with regex │
│ └── Dynamic path parameters (:id) │
│ └── Wildcard routes (*) │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Key Concepts You'll Learn:
- HTTP message format (request line, headers, body)
- HTTP methods (GET, POST, PUT, DELETE, etc.)
- Status code categories and when to use them
- Content negotiation
- URL routing patterns
- Builder design pattern
Time Estimate: 60-90 minutes
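As a warm-up for http/request.py, the message format above (request line, headers, blank line, body) can be parsed with a short function. This is a simplified sketch under stated assumptions, not the project's parser: `ParsedRequest` and `parse_request` are hypothetical names, chunked bodies are not handled, and the whole request is assumed to be in memory already.

```python
from dataclasses import dataclass
from typing import Dict, List
from urllib.parse import parse_qs, urlsplit


@dataclass
class ParsedRequest:
    method: str
    path: str
    version: str
    query: Dict[str, List[str]]
    headers: Dict[str, str]
    body: bytes


def parse_request(raw: bytes) -> ParsedRequest:
    # Headers end at the first blank line (CRLF CRLF).
    head, sep, rest = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete request: no blank line after headers")

    lines = head.decode("iso-8859-1").split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {lines[0]!r}")
    method, target, version = parts

    headers: Dict[str, str] = {}
    for line in lines[1:]:
        name, colon, value = line.partition(":")
        # Whitespace around the field name is a classic smuggling vector.
        if not colon or not name or name != name.strip():
            raise ValueError(f"malformed header line: {line!r}")
        key = name.lower()
        if key == "content-length" and key in headers:
            raise ValueError("duplicate Content-Length")
        headers[key] = value.strip()

    # Ambiguous framing: refuse rather than guess which header wins.
    if "content-length" in headers and "transfer-encoding" in headers:
        raise ValueError("both Content-Length and Transfer-Encoding present")

    raw_length = headers.get("content-length", "0")
    if not (raw_length.isascii() and raw_length.isdigit()):
        raise ValueError(f"invalid Content-Length: {raw_length!r}")
    length = int(raw_length)
    if len(rest) < length:
        raise ValueError("incomplete body")

    url = urlsplit(target)
    return ParsedRequest(method, url.path, version,
                         parse_qs(url.query), headers, rest[:length])
```

Rejecting ambiguous framing outright is the conservative choice against request smuggling: if a front-end proxy and this server disagree about where a body ends, an attacker can hide a second request inside the first.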
How to add features without modifying core code
┌─────────────────────────────────────────────────────────────────────────┐
│ MIDDLEWARE MODULE - src/httpserver/middleware/ │
│ ───────────────────────────────────────────── │
│ │
│ Read in this order: │
│ │
│ 13. middleware/__init__.py │
│ └── Middleware pipeline diagram │
│ └── What are cross-cutting concerns? │
│ │
│ 14. middleware/base.py ⭐ DESIGN PATTERN │
│ └── Chain of Responsibility pattern │
│ └── Middleware contract (before/after) │
│ └── How the pipeline wraps handlers │
│ │
│ 15. middleware/logging.py │
│ └── Request logging implementation │
│ └── X-Request-ID for distributed tracing │
│ └── Apache log format vs JSON │
│ │
│ 16. middleware/cors.py │
│ └── What is CORS and why it exists │
│ └── Preflight requests (OPTIONS) │
│ └── CORS headers explained │
│ │
│ 17. middleware/compression.py │
│ └── gzip compression │
│ └── Content negotiation (Accept-Encoding) │
│ └── When to compress │
│ │
│ 18. middleware/rate_limit.py ⭐ ALGORITHM │
│ └── Token Bucket algorithm (interview favorite!) │
│ └── Rate limiting strategies │
│ └── Distributed rate limiting concepts │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Key Concepts You'll Learn:
- Chain of Responsibility design pattern
- Middleware architecture
- CORS (Cross-Origin Resource Sharing)
- Token Bucket rate limiting algorithm
- HTTP compression (gzip)
- Distributed tracing with request IDs
Time Estimate: 45-60 minutes
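Since the Token Bucket algorithm comes up so often, here is a compact sketch to keep in mind while reading middleware/rate_limit.py. It is an illustration, not the project's implementation: the class name, parameter names, and the injectable `clock` argument are assumptions made so the example can be tested without sleeping.

```python
import threading
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_rate` tokens/second."""

    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate
        self._clock = clock
        self._tokens = capacity      # start full so an initial burst is allowed
        self._last = clock()
        self._lock = threading.Lock()  # worker threads share one bucket

    def allow(self, cost: float = 1.0) -> bool:
        with self._lock:
            now = self._clock()
            elapsed = now - self._last
            self._last = now
            # Refill lazily based on elapsed time, never above capacity.
            self._tokens = min(self.capacity,
                               self._tokens + elapsed * self.refill_rate)
            if self._tokens >= cost:
                self._tokens -= cost
                return True
            return False


if __name__ == "__main__":
    bucket = TokenBucket(capacity=2, refill_rate=1.0)
    print([bucket.allow() for _ in range(3)])  # third call is rejected
```

Refilling lazily inside `allow()` avoids a background timer thread. A rate-limit middleware would typically keep one bucket per client key (for example per IP address) and answer 429 when `allow()` returns `False`.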
Practical request handlers
┌─────────────────────────────────────────────────────────────────────────┐
│ HANDLERS MODULE - src/httpserver/handlers/ │
│ ───────────────────────────────────────── │
│ │
│ 19. handlers/__init__.py │
│ └── Handler types overview │
│ │
│ 20. handlers/health.py │
│ └── Kubernetes liveness vs readiness probes │
│ └── Health check patterns │
│ └── Why Cache-Control: no-store? │
│ │
│ 21. handlers/static.py │
│ └── Static file serving │
│ └── Path traversal attack prevention │
│ └── ETag caching │
│ └── HTTP caching headers │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Key Concepts You'll Learn:
- Kubernetes health probes (liveness, readiness)
- Static file serving security
- HTTP caching (ETag, Last-Modified, Cache-Control)
- Path traversal attack prevention
Time Estimate: 30 minutes
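Path traversal prevention is easiest to understand from a small example. The sketch below shows one common approach, assuming a hypothetical helper named `resolve_static_path`; handlers/static.py may do it differently. The idea is to decode the URL path, resolve it to an absolute path, and serve it only if the result is still inside the static root.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import unquote


def resolve_static_path(root: Path, url_path: str) -> Optional[Path]:
    """Map a URL path to a file path under `root`, or None if it escapes."""
    decoded = unquote(url_path)  # turns %2e%2e into ".." before checking
    if "\x00" in decoded:
        return None  # embedded NUL bytes are never legitimate in a path

    root = root.resolve()
    # resolve() collapses ".." segments and follows symlinks.
    candidate = (root / decoded.lstrip("/")).resolve()

    # Compare path components, not string prefixes: "/srv/public2" starts
    # with "/srv/public" as a string but is not inside it.
    if candidate != root and root not in candidate.parents:
        return None
    return candidate
```

Checking after resolution matters: filtering the raw string for `..` misses encoded variants such as `%2e%2e`, and a string `startswith` check accepts sibling directories that share a prefix with the root.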
The orchestration layer
┌─────────────────────────────────────────────────────────────────────────┐
│ SERVER ORCHESTRATION │
│ ─────────────────── │
│ │
│ 22. config.py │
│ └── 12-factor app configuration │
│ └── Environment variables │
│ └── Configuration validation │
│ │
│ 23. server.py ⭐ THE HEART │
│ └── How all components connect │
│ └── Request lifecycle (accept → parse → route → respond) │
│ └── Keep-alive connection handling │
│ └── Error handling │
│ │
│ 24. __main__.py │
│ └── CLI entry point │
│ └── Argument parsing │
│ └── Application bootstrap │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Key Concepts You'll Learn:
- How all components integrate
- Request lifecycle end-to-end
- Configuration management
- CLI design patterns
Time Estimate: 30-45 minutes
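The 12-factor configuration idea covered in config.py boils down to reading settings from the environment and validating them at startup. A minimal sketch follows; the variable names `HTTP_HOST`, `HTTP_PORT`, and `HTTP_WORKERS`, the defaults, and the `ServerConfig` fields are all assumptions for illustration, so check config.py for the real ones.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    host: str = "127.0.0.1"
    port: int = 8080
    workers: int = 4

    def __post_init__(self) -> None:
        # Fail fast at startup instead of at the first request.
        if not 0 < self.port < 65536:
            raise ValueError(f"port out of range: {self.port}")
        if self.workers < 1:
            raise ValueError(f"workers must be >= 1, got {self.workers}")


def load_config(env: Optional[Mapping[str, str]] = None) -> ServerConfig:
    """Build a config from environment variables (hypothetical names)."""
    if env is None:
        env = os.environ
    return ServerConfig(
        host=env.get("HTTP_HOST", "127.0.0.1"),
        port=int(env.get("HTTP_PORT", "8080")),     # int() rejects non-numeric input
        workers=int(env.get("HTTP_WORKERS", "4")),
    )
```

Passing the environment as a parameter keeps the loader easy to test: `load_config({"HTTP_PORT": "9000"})` exercises the parsing without touching the real process environment.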
START
│
▼
┌─────────────────┐
│ main.py │ ← Networking Fundamentals
│ (textbook) │
└────────┬────────┘
│
▼
┌────────────────────┴────────────────────┐
│ │
▼ ▼
┌───────────────┐ ┌───────────────┐
│ core/ │ ← Sockets, Threads │ http/ │ ← Protocol
│ socket_server │ │ request │
│ connection │ │ response │
│ thread_pool │ │ router │
└───────┬───────┘ └───────┬───────┘
│ │
└────────────────┬───────────────────────┘
│
▼
┌─────────────────────┐
│ middleware/ │ ← Cross-cutting
│ logging, cors, │
│ rate_limit, gzip │
└──────────┬──────────┘
│
▼
┌─────────────────────┐
│ handlers/ │ ← Business Logic
│ health, static │
└──────────┬──────────┘
│
▼
┌─────────────────────┐
│ server.py │ ← Orchestration
│ (ties it all) │
└──────────┬──────────┘
│
▼
DONE!
| Phase | Topic | Time |
|---|---|---|
| 1 | Foundation (`main.py`, `__init__.py`) | 30-45 min |
| 2 | Core (sockets, connections, threads) | 45-60 min |
| 3 | HTTP (request, response, routing) | 60-90 min |
| 4 | Middleware (logging, CORS, rate limit) | 45-60 min |
| 5 | Handlers (health, static) | 30 min |
| 6 | Orchestration (config, server) | 30-45 min |
| Total | | 4-6 hours |
After reading this project, you should understand:
- OSI model layers and TCP/IP model
- TCP vs UDP differences
- 3-way handshake
- Socket API (socket, bind, listen, accept, recv, send)
- Connection states
- HTTP request format (request line, headers, body)
- HTTP response format (status line, headers, body)
- HTTP methods and when to use them
- Status code categories (1xx-5xx)
- HTTP/1.1 keep-alive
- Thread pool pattern
- Task queues
- Thread synchronization (locks, conditions)
- Graceful shutdown
- Chain of Responsibility (middleware)
- Builder (response building)
- Factory (handlers)
- Strategy (routing)
- Path traversal prevention
- Request size limits
- Rate limiting
- CORS
- Health checks (liveness, readiness)
- Structured logging
- Configuration management
- HTTP caching
After reading, try these exercises to solidify your understanding:
- Run the server and make requests with `curl`
- Add a new route - Create a `/api/time` endpoint
- Modify logging - Add a custom field to the log output
- Create custom middleware - Add a request ID validation middleware
- Add a new status code handler - Implement 429 retry logic
- Extend static handler - Add support for range requests
- Implement basic auth middleware - Check Authorization header
- Add metrics middleware - Count requests per endpoint
- Implement request caching - Cache GET responses in memory
Each file contains interview Q&A sections. Here are the must-know topics:
- From main.py: TCP vs UDP, 3-way handshake, socket lifecycle
- From thread_pool.py: Thread pool pattern, why not unlimited threads
- From request.py: HTTP parsing, security (request smuggling)
- From rate_limit.py: Token Bucket algorithm
- From cors.py: Same-origin policy, preflight requests
- From health.py: Liveness vs readiness probes
- RFC 7230 - HTTP/1.1 Message Syntax and Routing (obsoleted by RFC 9112)
- RFC 7231 - HTTP/1.1 Semantics and Content (obsoleted by RFC 9110)
- Python socket documentation
- Python threading documentation
Happy Learning! 🎉