diff --git a/CHANGELOG.md b/CHANGELOG.md
index c2119f5..b650103 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,192 @@ This project adheres to [Semantic Versioning](https://semver.org/).
---
+## [0.2.0] — Remediation Release
+
+This release resolves all 40 issues identified in the 2026-03-20 comprehensive security and reliability audit. Changes are grouped by the audit's five severity phases.
+
+---
+
+### Phase 1 — Critical Security & Correctness
+
+#### 1.1 — Config Path Traversal: `site.directory` and `logging.file` Validated
+
+`src/config/loader.rs` — `validate()` now rejects any `site.directory` or `logging.file` value that is an absolute path, contains a `..` component, or contains a platform path separator. The process exits with a clear validation error before binding any port. Previously, a value such as `directory = "../../etc"` caused the HTTP server to serve the entire `/etc` tree, and a value such as `../../.ssh/authorized_keys` for `logging.file` caused log lines to be appended to the SSH authorized keys file.
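
The shape of the check can be sketched with `std::path` components (names here are illustrative; the real `validate()` in `src/config/loader.rs` also rejects embedded platform separators):

```rust
use std::path::{Component, Path};

/// Sketch of the absolute-path and `..` checks described above.
/// `is_safe_relative_path` is an illustrative name, not the real API.
fn is_safe_relative_path(value: &str) -> bool {
    let path = Path::new(value);
    if path.is_absolute() {
        return false;
    }
    // Component::ParentDir matches any `..` segment, however it was
    // spelled in the original string.
    !path.components().any(|c| matches!(c, Component::ParentDir))
}
```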
+
+#### 1.2 — Race Condition: Tor Captures Bound Port via `oneshot` Channel
+
+`src/runtime/lifecycle.rs`, `src/server/mod.rs` — The 50 ms sleep that was the sole synchronisation barrier between the HTTP server binding its port and the Tor subsystem reading that port has been replaced with a `tokio::sync::oneshot` channel. The server sends the actual bound port through the channel before entering the accept loop; `tor::init` awaits that value (with a 10-second timeout) rather than reading a potentially-zero value out of `SharedState`. Previously, on a loaded system the race could be lost silently, causing every inbound Tor connection to fail with `ECONNREFUSED` to port 0 while the dashboard displayed a healthy green `TorStatus::Ready`.
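
The handoff pattern, sketched here with blocking `std` primitives for illustration (the real code uses `tokio::sync::oneshot` and async tasks; `bind_and_report` is an illustrative name):

```rust
use std::net::TcpListener;
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Bind, report the real port over a channel, then accept.
fn bind_and_report() -> u16 {
    let (port_tx, port_rx) = mpsc::channel::<u16>();
    thread::spawn(move || {
        // Bind to port 0 so the OS assigns a free port.
        let listener = TcpListener::bind("127.0.0.1:0").expect("bind");
        let port = listener.local_addr().expect("addr").port();
        // Send the actual bound port *before* entering the accept loop.
        port_tx.send(port).expect("receiver alive");
        let _ = listener.accept();
    });
    // Consumer side: await the port with a timeout instead of sleeping
    // and then reading possibly-stale shared state.
    port_rx
        .recv_timeout(Duration::from_secs(10))
        .expect("server failed to report its port")
}
```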
+
+#### 1.3 — XSS in Directory Listing via Unsanitised Filenames
+
+`src/server/handler.rs` — `build_directory_listing()` now HTML-entity-escapes all filenames before interpolating them into link text (`&` → `&amp;`, `<` → `&lt;`, `>` → `&gt;`, `"` → `&quot;`, `'` → `&#x27;`) and percent-encodes filenames in `href` attribute values. Previously, a filename containing `"` and `>` could break out of the link's `href` attribute and inject executable markup into any directory listing page.
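
A minimal sketch of the escaping step (illustrative; the real `build_directory_listing()` additionally percent-encodes `href` values):

```rust
/// HTML-entity-escape untrusted text before embedding it in markup.
/// `escape_html` is an illustrative name, not the real API.
fn escape_html(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    // The five characters that can change meaning inside HTML text
    // or attribute values.
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#x27;"),
            other => out.push(other),
        }
    }
    out
}
```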
+
+#### 1.4 — HEAD Requests No Longer Receive a Response Body
+
+`src/server/handler.rs` — `parse_path()` now returns `(method, path)` instead of only the path. The method is threaded through to `write_response()` via a `suppress_body: bool` parameter. For `HEAD` requests, response headers (including `Content-Length` reflecting the full body size, as required by RFC 7231 §4.3.2) are written, but the body is not sent.
+
+#### 1.5 — Request Timeout Prevents Slow-Loris DoS
+
+`src/server/handler.rs` — The call to `read_request()` is now wrapped in `tokio::time::timeout(Duration::from_secs(30))`. Connections that fail to deliver a complete request header within 30 seconds receive a `408 Request Timeout` response and are closed. The timeout is also configurable via `[server] request_timeout_secs` in `settings.toml`. Timeout events are logged at `debug` level to avoid log flooding under attack.
+
+#### 1.6 — Unbounded Connection Spawning Replaced with Semaphore
+
+`src/server/mod.rs`, `src/tor/mod.rs` — Both the HTTP accept loop and the Tor stream request loop now use a `tokio::sync::Semaphore` to cap concurrent connections. The limit is configurable via `[server] max_connections` (default: 256). The semaphore `OwnedPermit` is held for the lifetime of each connection task and released on drop. When the limit is reached, the accept loop suspends naturally, providing backpressure; a `warn`-level log entry is emitted. Previously, unlimited concurrent connections could exhaust task stack memory and file descriptors.
+
+#### 1.7 — Files Streamed Instead of Read Entirely Into Memory
+
+`src/server/handler.rs` — `tokio::fs::read` (which loads the entire file into a `Vec`) has been replaced with `tokio::fs::File::open` followed by `tokio::io::copy(&mut file, &mut stream)`. File size is obtained via `file.metadata().await?.len()` for the `Content-Length` header. Memory consumption per connection is now bounded by the kernel socket buffer (~128–256 KB) regardless of file size. For `HEAD` requests, the file is opened only to read its size; the `copy` step is skipped.
+
+#### 1.8 — `strip_timestamp` No Longer Panics on Non-ASCII Log Lines
+
+`src/console/dashboard.rs` — `strip_timestamp()` previously used a byte index derived from iterating `.bytes()` to slice a `&str`, which panicked when the index fell inside a multi-byte UTF-8 character. The implementation now uses `splitn(3, ']')` to strip the leading `[LEVEL]` and `[HH:MM:SS]` tokens, which is both panic-safe and simpler. Any log line containing Unicode characters (Arti relay names, internationalized filenames, `.onion` addresses) is handled correctly.
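
A sketch of the `splitn`-based approach (assuming the `[LEVEL] [HH:MM:SS] message` line shape described above; the real function may differ in detail):

```rust
/// Strip the leading "[LEVEL] [HH:MM:SS]" tokens without byte slicing.
fn strip_timestamp(line: &str) -> &str {
    // splitn(3, ']') yields at most "[LEVEL", " [HH:MM:SS", and the rest.
    // It splits on char boundaries, so multi-byte UTF-8 in the message
    // can never trigger a slice panic.
    match line.splitn(3, ']').nth(2) {
        Some(rest) => rest.trim_start(),
        None => line,
    }
}
```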
+
+#### 1.9 — `TorStatus` Updated to `Failed` When Onion Service Terminates
+
+`src/tor/mod.rs` — When `stream_requests.next()` returns `None` (the onion service stream ends unexpectedly), the status is now set to `TorStatus::Failed("stream ended".to_string())` and the `onion_address` field is cleared from `AppState`. Previously, the dashboard permanently displayed a healthy green badge and the `.onion` address after the service had silently stopped serving traffic.
+
+#### 1.10 — Terminal Fully Restored on All Exit Paths; Panic Hook Registered
+
+`src/main.rs`, `src/console/mod.rs` — The error handler in `main.rs` now calls `console::cleanup()` (which issues `cursor::Show` and `terminal::LeaveAlternateScreen` before `disable_raw_mode`) on all failure paths. A `std::panic::set_hook` registered at startup ensures the same cleanup runs even when a panic occurs on an async executor thread. `console::cleanup()` is idempotent (guarded by a `RAW_MODE_ACTIVE` atomic swap), so calling it from multiple paths is safe.
+
+---
+
+### Phase 2 — High Priority Reliability
+
+#### 2.1 — HTTP Request Reading Buffered with `BufReader`
+
+`src/server/handler.rs` — `read_request()` previously read one byte at a time, issuing up to 8,192 individual `read` syscalls per request. The stream is now wrapped in a `tokio::io::BufReader`, and headers are read line-by-line with `read_line()`. The 8 KiB header size limit is enforced by accumulating the total bytes read. This also correctly handles a `\r\n\r\n` terminator split across TCP segments.
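
The reading pattern, sketched with blocking `std::io` for brevity (the server uses the analogous async `tokio::io::BufReader` API; `read_head` is an illustrative name):

```rust
use std::io::{BufRead, BufReader, Read};

const MAX_HEADER_BYTES: usize = 8 * 1024;

/// Read the request head line-by-line with a total-size cap.
fn read_head<R: Read>(stream: R) -> std::io::Result<Vec<String>> {
    let mut reader = BufReader::new(stream);
    let mut total = 0usize;
    let mut lines = Vec::new();
    loop {
        let mut line = String::new();
        // One buffered call per line instead of one syscall per byte.
        let n = reader.read_line(&mut line)?;
        total += n;
        if total > MAX_HEADER_BYTES {
            return Err(std::io::Error::new(
                std::io::ErrorKind::InvalidData,
                "request header exceeds 8 KiB",
            ));
        }
        // EOF or a bare CRLF terminates the header section; because the
        // reader buffers, a terminator split across TCP segments is fine.
        if n == 0 || line == "\r\n" || line == "\n" {
            break;
        }
        lines.push(line.trim_end().to_string());
    }
    Ok(lines)
}
```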
+
+#### 2.2 — `scan_site` is Now Recursive, Error-Propagating, and Non-Blocking
+
+`src/server/mod.rs`, `src/runtime/lifecycle.rs`, `src/runtime/events.rs` — `scan_site` now performs a breadth-first traversal using a `VecDeque` work queue, counting files and sizes in all subdirectories. The return type is now `Result<(u32, u64)>`; errors from `read_dir` are propagated and logged at `warn` level rather than silently returning `(0, 0)`. All call sites wrap the function in `tokio::task::spawn_blocking` to avoid blocking the async executor on directory I/O.
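
A sketch of the breadth-first traversal (illustrative; the real `scan_site` runs under `spawn_blocking` and logs failures at `warn`):

```rust
use std::collections::VecDeque;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Breadth-first file count and total size, propagating I/O errors.
fn scan_site(root: &Path) -> io::Result<(u32, u64)> {
    let mut queue: VecDeque<PathBuf> = VecDeque::from([root.to_path_buf()]);
    let (mut files, mut bytes) = (0u32, 0u64);
    while let Some(dir) = queue.pop_front() {
        // read_dir errors bubble up instead of silently yielding (0, 0).
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let meta = entry.metadata()?;
            if meta.is_dir() {
                // Defer subdirectories onto the work queue.
                queue.push_back(entry.path());
            } else if meta.is_file() {
                files += 1;
                bytes += meta.len();
            }
        }
    }
    Ok((files, bytes))
}
```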
+
+#### 2.3 — `canonicalize()` Called Once at Startup, Not Per Request
+
+`src/server/mod.rs`, `src/server/handler.rs` — The site root is now canonicalized once in `server::run()` and passed as a pre-computed `PathBuf` into each connection handler. The per-request `site_root.canonicalize()` call in `resolve_path()` has been removed, eliminating a `realpath()` syscall on every request.
+
+#### 2.4 — `open_browser` Deduplicated
+
+`src/runtime/lifecycle.rs`, `src/runtime/events.rs`, `src/runtime/mod.rs` — The `open_browser` function was duplicated in `lifecycle.rs` and `events.rs`. It now lives in a single location (`src/runtime/mod.rs`) and both call sites use the shared implementation.
+
+#### 2.5 — `#[serde(deny_unknown_fields)]` on All Config Structs
+
+`src/config/mod.rs` — All `#[derive(Deserialize)]` config structs (`Config`, `ServerConfig`, `SiteConfig`, `TorConfig`, `LoggingConfig`, `ConsoleConfig`, `IdentityConfig`) now carry `#[serde(deny_unknown_fields)]`. A misspelled key such as `bund = "127.0.0.1"` now causes a startup error naming the unknown field rather than silently using the compiled-in default.
+
+#### 2.6 — `auto_reload` Removed (Was Unimplemented)
+
+`src/config/mod.rs`, `src/config/defaults.rs` — The `auto_reload` field was present in the config struct and advertised in the default `settings.toml` but had no implementation. It has been removed entirely. The `[R]` key for manual site stat reloads is unaffected.
+
+#### 2.7 — ANSI Terminal Injection Prevention Documented and Tested
+
+`src/config/loader.rs` — The existing `char::is_control` check on `instance_name` (which covers ESC `\x1b`, NUL `\x00`, BEL `\x07`, and BS `\x08`) is confirmed to prevent terminal injection. An explicit comment now documents the security intent, and dedicated test cases cover each injection vector.
+
+#### 2.8 — Keyboard Input Task Failure Now Detected and Reported
+
+`src/runtime/lifecycle.rs` — If the `spawn_blocking` input task exits (causing `key_rx` to close), `recv().await` returning `None` is now detected. A `warn`-level log entry is emitted ("Console input task exited — keyboard input disabled. Use Ctrl-C to quit.") and subsequent iterations no longer attempt to receive from the closed channel. Previously, input task death was completely silent.
+
+#### 2.9 — `TorStatus::Failed` Now Carries a Reason String
+
+`src/runtime/state.rs`, `src/console/dashboard.rs` — `TorStatus::Failed(Option<i32>)` (the exit code variant, which was never constructed) has been replaced with `TorStatus::Failed(String)`. Construction sites pass a brief reason string (`"bootstrap failed"`, `"stream ended"`, `"launch failed"`). The dashboard now renders `FAILED (reason) — see log for details` instead of a bare `FAILED`.
+
+#### 2.10 — Graceful Shutdown Uses `JoinSet` and Proper Signalling
+
+`src/runtime/lifecycle.rs`, `src/server/mod.rs`, `src/tor/mod.rs` — The 300 ms fixed sleep that gated shutdown has been replaced with proper task completion signalling. A clone of `shutdown_rx` is passed into `tor::init()`; the Tor run loop watches it via `tokio::select!` and exits cleanly on shutdown. In-flight HTTP connection tasks are tracked in a `JoinSet`; after the accept loop exits, `join_set.join_all()` is awaited with a 5-second timeout, allowing in-progress transfers to complete before the process exits.
+
+#### 2.11 — Log File Flushed on Graceful Shutdown
+
+`src/logging/mod.rs`, `src/runtime/lifecycle.rs` — A `pub fn flush()` function has been added to the logging module. The shutdown sequence calls it explicitly after the connection drain wait, ensuring all buffered log entries (including the `"RustHost shut down cleanly."` sentinel) are written to disk before the process exits.
+
+---
+
+### Phase 3 — Performance
+
+#### 3.1 — `data_dir()` Computed Once at Startup
+
+`src/runtime/lifecycle.rs` — `data_dir()` (which calls `std::env::current_exe()` internally) was previously called on every key event dispatch inside `event_loop`. It is now computed exactly once at the top of `normal_run()`, stored in a local variable, and passed as a parameter to all functions that need it.
+
+#### 3.2 — `Arc<PathBuf>` and `Arc<String>` Eliminate Per-Connection Heap Allocations
+
+`src/server/mod.rs`, `src/server/handler.rs` — `site_root` and `index_file` are now wrapped in `Arc<PathBuf>` and `Arc<String>` respectively before the accept loop. Each connection task receives a cheap `Arc` clone (reference-count increment) rather than a full heap allocation.
+
+#### 3.3 — Dashboard Render Task Skips Redraws When Output Is Unchanged
+
+`src/console/mod.rs` — The render task now compares the rendered output string against the previously written string. If identical, the `execute!` and `write_all` calls are skipped entirely. This eliminates terminal writes on idle ticks, which is the common case for a server with no active traffic.
+
+#### 3.4 — MIME Lookup No Longer Allocates a `String` Per Request
+
+`src/server/mime.rs` — The `for_extension` function previously called `ext.to_ascii_lowercase()`, allocating a heap `String` on every request. The comparison now uses `str::eq_ignore_ascii_case` directly against the extension string, with no allocation.
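
The allocation-free comparison can be sketched as follows (the table here is a small illustrative subset of the real one):

```rust
/// Case-insensitive extension lookup with no per-request allocation.
fn for_extension(ext: &str) -> &'static str {
    const TABLE: &[(&str, &str)] = &[
        ("html", "text/html"),
        ("css", "text/css"),
        ("js", "text/javascript"),
        ("png", "image/png"),
    ];
    TABLE
        .iter()
        // eq_ignore_ascii_case compares in place; no to_ascii_lowercase
        // String allocation on the hot path.
        .find(|(e, _)| e.eq_ignore_ascii_case(ext))
        .map(|(_, mime)| *mime)
        .unwrap_or("application/octet-stream")
}
```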
+
+#### 3.5 — Log Ring Buffer Lock Not Held During `String` Clone
+
+`src/logging/mod.rs` — The log line string is now cloned before acquiring the ring buffer mutex. The mutex is held only for the `push_back` of the already-allocated string, reducing lock contention from Arti's multi-threaded internal logging.
+
+#### 3.6 — Tokio Feature Flags Made Explicit
+
+`Cargo.toml` — `tokio = { features = ["full"] }` has been replaced with an explicit feature list: `rt-multi-thread`, `net`, `io-util`, `fs`, `sync`, `time`, `macros`, `signal`. Unused features (`process`, `io-std`) are no longer compiled, reducing binary size and build time.
+
+---
+
+### Phase 4 — Architecture & Design
+
+#### 4.1 — Typed `AppError` Enum Introduced
+
+`src/error.rs` (new), `src/main.rs`, all modules — The global `Box<dyn std::error::Error>` result alias has been replaced with a typed `AppError` enum using `thiserror`. Variants: `ConfigLoad`, `ConfigValidation`, `LogInit`, `ServerBind { port, source }`, `Tor`, `Io`, `Console`. Error messages now preserve structured context at the type level.
+
+#### 4.2 — Config Structs Use Typed Fields
+
+`src/config/mod.rs`, `src/config/loader.rs` — `LoggingConfig.level` is now a `LogLevel` enum (`Trace` | `Debug` | `Info` | `Warn` | `Error`) with `#[serde(rename_all = "lowercase")]`; the duplicate validation in `loader.rs` and `logging/mod.rs` has been removed. `ServerConfig.bind` is now `std::net::IpAddr` via `#[serde(try_from = "String")]`. The parse-then-validate pattern is eliminated in favour of deserialisation-time typing.
+
+#### 4.3 — Dependency Log Noise Filtered by Default
+
+`src/logging/mod.rs` — `RustHostLogger::enabled()` now suppresses `Info`-and-below records from non-`rusthost` targets (Arti, Tokio internals). Warnings and errors from all crates are still passed through. This prevents the ring buffer and log file from being flooded with Tor bootstrap noise. Configurable via `[logging] filter_dependencies = true` (default `true`); set `false` to pass all crate logs at the configured level.
+
+#### 4.4 — `data_dir()` Free Function Eliminated; Path Injected
+
+`src/runtime/lifecycle.rs` and all callers — The `data_dir()` free function (which called `current_exe()` as a hidden dependency) has been removed. The data directory `PathBuf` is now a first-class parameter threaded through the call chain from `normal_run`, enabling test injection of temporary directories.
+
+#### 4.5 — `percent_decode` Correctly Handles Multi-Byte UTF-8 and Null Bytes
+
+`src/server/handler.rs` — The previous implementation decoded each `%XX` token as a standalone `char` cast from a `u8`, producing incorrect output for multi-byte sequences (e.g., `%C3%A9` was decoded as two garbage characters instead of `é`). The function now accumulates consecutive decoded bytes into a `Vec` buffer and flushes via `String::from_utf8_lossy` when a literal character is encountered, correctly reassembling multi-byte sequences. Null bytes (`%00`) are left as the literal string `%00` in the output rather than being decoded.
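
A sketch of the buffer-and-flush approach described above (illustrative; the real function in `src/server/handler.rs` may differ in detail):

```rust
/// Percent-decoding that reassembles multi-byte UTF-8 and refuses to
/// decode null bytes.
fn percent_decode(input: &str) -> String {
    fn hex_val(b: u8) -> Option<u8> {
        match b {
            b'0'..=b'9' => Some(b - b'0'),
            b'a'..=b'f' => Some(b - b'a' + 10),
            b'A'..=b'F' => Some(b - b'A' + 10),
            _ => None,
        }
    }
    let bytes = input.as_bytes();
    let mut out = String::with_capacity(input.len());
    let mut buf: Vec<u8> = Vec::new(); // run of consecutive decoded bytes
    let mut i = 0;
    while i < bytes.len() {
        // Try to decode a %XX token at position i.
        let decoded = if bytes[i] == b'%' && i + 2 < bytes.len() {
            match (hex_val(bytes[i + 1]), hex_val(bytes[i + 2])) {
                (Some(hi), Some(lo)) => Some(hi * 16 + lo),
                _ => None,
            }
        } else {
            None
        };
        match decoded {
            // Accumulate decoded bytes so multi-byte sequences such as
            // %C3%A9 are reassembled as a unit. %00 is excluded: it stays
            // literal so null bytes never reach path handling.
            Some(b) if b != 0 => {
                buf.push(b);
                i += 3;
            }
            // ASCII literal: flush the accumulated run, then copy it.
            _ if bytes[i] < 0x80 => {
                if !buf.is_empty() {
                    out.push_str(&String::from_utf8_lossy(&buf));
                    buf.clear();
                }
                out.push(bytes[i] as char);
                i += 1;
            }
            // Literal multi-byte UTF-8 in the input: keep the raw byte so
            // the final flush reassembles it intact.
            _ => {
                buf.push(bytes[i]);
                i += 1;
            }
        }
    }
    if !buf.is_empty() {
        out.push_str(&String::from_utf8_lossy(&buf));
    }
    out
}
```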
+
+#### 4.6 — `deny.toml` Updated with All Duplicate Crate Skip Entries
+
+`deny.toml` — Five duplicate crate version pairs that were absent from `bans.skip` but present in the lock file have been added with comments identifying the dependency trees that pull each version: `foldhash`, `hashbrown`, `indexmap`, `redox_syscall`, and `schemars`. `cargo deny check` now passes cleanly.
+
+#### 4.7 — `ctrlc` Crate Replaced with `tokio::signal`
+
+`Cargo.toml`, `src/runtime/lifecycle.rs` — The `ctrlc = "3"` dependency has been removed. Signal handling is now done via `tokio::signal::ctrl_c()` (cross-platform) and `tokio::signal::unix::signal(SignalKind::interrupt())` (Unix), integrated directly into the `select!` inside `event_loop`. This eliminates threading concerns between the `ctrlc` crate's signal handler and Tokio's internal signal infrastructure.
+
+---
+
+### Phase 5 — Testing, Observability & Hardening
+
+#### 5.1 — Unit Tests Added for All Security-Critical Functions
+
+`src/server/handler.rs`, `src/server/mod.rs`, `src/config/loader.rs`, `src/console/dashboard.rs`, `src/tor/mod.rs` — `#[cfg(test)]` modules added to each file. Coverage includes: `percent_decode` (ASCII, spaces, multi-byte UTF-8, null bytes, incomplete sequences, invalid hex); `resolve_path` (normal file, directory traversal, encoded-slash traversal, missing file, missing root); `validate` (valid config, `site.directory` path traversal, absolute path, `logging.file` traversal, port 0, invalid IP, unknown field); `strip_timestamp` (ASCII line, multi-byte UTF-8 line, line with no brackets); `hsid_to_onion_address` (known test vector against reference implementation).
+
+#### 5.2 — Integration Tests Added for HTTP Server Core Flows
+
+`tests/http_integration.rs` (new) — Integration tests using `tokio::net::TcpStream` against a test server bound on port 0. Covers: `GET /index.html` → 200; `HEAD /index.html` → correct `Content-Length`, no body; `GET /` with `index_file` configured; `GET /../etc/passwd` → 403; request header > 8 KiB → 400; `GET /nonexistent.txt` → 404; `POST /index.html` → 400.
+
+#### 5.3 — Security Response Headers Added to All Responses
+
+`src/server/handler.rs` — All responses now include `X-Content-Type-Options: nosniff`, `X-Frame-Options: SAMEORIGIN`, `Referrer-Policy: no-referrer`, and `Permissions-Policy: camera=(), microphone=(), geolocation=()`. HTML responses additionally include `Content-Security-Policy: default-src 'self'` (configurable via `[server] content_security_policy` in `settings.toml`). The `Referrer-Policy: no-referrer` header is especially relevant for the Tor onion service: it prevents the `.onion` URL from leaking in the `Referer` header to any third-party resources loaded by served HTML.
+
+#### 5.4 — Accept Loop Error Handling Uses Exponential Backoff
+
+`src/server/mod.rs` — The accept loop previously retried immediately on error, producing thousands of log entries per second on persistent errors such as `EMFILE`. Errors now trigger exponential backoff (starting at 1 ms, doubling up to 1 second). `EMFILE` is logged at `error` level (operator intervention required); transient errors (`ECONNRESET`, `ECONNABORTED`) are logged at `debug`. The backoff counter resets on successful accept.
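
The backoff policy can be sketched as a small state machine (illustrative naming):

```rust
use std::time::Duration;

/// Exponential accept-loop backoff: start at 1 ms, double per
/// consecutive error, cap at 1 s, reset on a successful accept.
struct Backoff {
    current: Duration,
}

impl Backoff {
    const INITIAL: Duration = Duration::from_millis(1);
    const MAX: Duration = Duration::from_secs(1);

    fn new() -> Self {
        Self { current: Self::INITIAL }
    }

    /// Delay to sleep after an accept error; doubles for next time.
    fn next_delay(&mut self) -> Duration {
        let delay = self.current;
        self.current = (self.current * 2).min(Self::MAX);
        delay
    }

    /// Called after a successful accept.
    fn reset(&mut self) {
        self.current = Self::INITIAL;
    }
}
```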
+
+#### 5.5 — CLI Arguments Added (`--config`, `--data-dir`, `--version`, `--help`)
+
+`src/main.rs`, `src/runtime/lifecycle.rs` — The binary now accepts `--config <path>` and `--data-dir <path>` to override the default config and data directory paths (previously inferred from `current_exe()`). `--version` prints the crate version and exits. `--help` prints a usage summary. These flags enable multi-instance deployments, systemd unit files with explicit paths, and CI test runs without relying on the working directory.
+
+#### 5.6 — `cargo deny check` Passes Cleanly; `audit.toml` Consolidated
+
+`deny.toml`, CI — `audit.toml` (which suppressed `RUSTSEC-2023-0071` without a documented rationale) has been removed. Advisory suppression is now managed exclusively in `deny.toml`, which carries the full justification. CI now runs `cargo deny check` as a required step, subsuming the advisory check. The existing rationale for `RUSTSEC-2023-0071` is unchanged: the `rsa` crate is used only for signature verification on Tor directory documents, not for decryption; the Marvin timing attack's threat model does not apply.
+
+---
+
## [0.1.0] — Initial Release
### HTTP Server
diff --git a/Cargo.lock b/Cargo.lock
index bd9fe3f..4e1ea7b 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -375,15 +375,6 @@ dependencies = [
"generic-array",
]
-[[package]]
-name = "block2"
-version = "0.6.2"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "cdeb9d870516001442e364c5220d3574d2da8dc765554b4a617230d33fa58ef5"
-dependencies = [
- "objc2",
-]
-
[[package]]
name = "bstr"
version = "1.12.1"
@@ -455,12 +446,6 @@ version = "1.0.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9330f8b2ff13f34540b44e946ef35111825727b38d33286ef986142615121801"
-[[package]]
-name = "cfg_aliases"
-version = "0.2.1"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "613afe47fcd5fac7ccf1db93babcb082c5994d996f20b8b159f2ad1658eb5724"
-
[[package]]
name = "chrono"
version = "0.4.44"
@@ -776,17 +761,6 @@ dependencies = [
"cipher",
]
-[[package]]
-name = "ctrlc"
-version = "3.5.2"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "e0b1fab2ae45819af2d0731d60f2afe17227ebb1a1538a236da84c93e9a60162"
-dependencies = [
- "dispatch2",
- "nix",
- "windows-sys 0.61.2",
-]
-
[[package]]
name = "curve25519-dalek"
version = "4.1.3"
@@ -1082,18 +1056,6 @@ dependencies = [
"windows-sys 0.61.2",
]
-[[package]]
-name = "dispatch2"
-version = "0.3.1"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "1e0e367e4e7da84520dedcac1901e4da967309406d1e51017ae1abfb97adbd38"
-dependencies = [
- "bitflags 2.11.0",
- "block2",
- "libc",
- "objc2",
-]
-
[[package]]
name = "displaydoc"
version = "0.2.5"
@@ -2210,18 +2172,6 @@ dependencies = [
"tempfile",
]
-[[package]]
-name = "nix"
-version = "0.31.2"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "5d6d0705320c1e6ba1d912b5e37cf18071b6c2e9b7fa8215a1e8a7651966f5d3"
-dependencies = [
- "bitflags 2.11.0",
- "cfg-if",
- "cfg_aliases",
- "libc",
-]
-
[[package]]
name = "nom"
version = "7.1.3"
@@ -2366,15 +2316,6 @@ dependencies = [
"syn 2.0.117",
]
-[[package]]
-name = "objc2"
-version = "0.6.4"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "3a12a8ed07aefc768292f076dc3ac8c48f3781c8f2d5851dd3d98950e8c5a89f"
-dependencies = [
- "objc2-encode",
-]
-
[[package]]
name = "objc2-core-foundation"
version = "0.3.2"
@@ -2384,12 +2325,6 @@ dependencies = [
"bitflags 2.11.0",
]
-[[package]]
-name = "objc2-encode"
-version = "4.1.0"
-source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "ef25abbcd74fb2609453eb695bd2f860d389e457f67dc17cafc8b8cbc89d0c33"
-
[[package]]
name = "objc2-io-kit"
version = "0.3.2"
@@ -3123,12 +3058,14 @@ dependencies = [
"arti-client",
"chrono",
"crossterm",
- "ctrlc",
"data-encoding",
"futures",
+ "libc",
"log",
"serde",
"sha3",
+ "tempfile",
+ "thiserror 1.0.69",
"tokio",
"toml 0.8.23",
"tor-cell",
@@ -3864,7 +3801,6 @@ dependencies = [
"bytes",
"libc",
"mio 1.1.1",
- "parking_lot",
"pin-project-lite",
"signal-hook-registry",
"socket2",
diff --git a/Cargo.toml b/Cargo.toml
index 7e01afa..0180356 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -10,25 +10,41 @@ description = "Single-binary, zero-setup static site hosting appliance with Tor
license = "MIT"
authors = []
+[lib]
+name = "rusthost"
+path = "src/lib.rs"
+
[[bin]]
name = "rusthost-cli"
path = "src/main.rs"
-[dependencies]
-# Async runtime — drives the server, console, and all I/O
-tokio = { version = "1", features = ["full"] }
-
-# ─── Arti (in-process Tor) ────────────────────────────────────────────────────
-#
-# arti-client — the high-level Tor client API (bootstrap, launch onion service)
-# tokio — use Tokio as the async runtime (default, required)
-# native-tls — TLS backend for connecting to Tor relays (default)
-# onion-service-service — enables hosting (serving) onion services
+# ─── Lint configuration ───────────────────────────────────────────────────────
#
-# tor-hsservice — lower-level onion service types:
-# handle_rend_requests, OnionServiceConfigBuilder, StreamRequest
-#
-# futures — StreamExt trait for iterating over the stream of StreamRequests
+# clippy::all — every lint in the default set (correctness + style + perf)
+# clippy::pedantic — stricter style + API lints; individual allows used where
+# the rule conflicts with the module's documented design
+# (e.g. too_many_arguments for the HTTP write_* stack which
+# must mirror the HTTP/1.1 wire format).
+[lints.rust]
+unsafe_code = "forbid"
+
+[lints.clippy]
+all = { level = "deny", priority = -1 }
+pedantic = { level = "deny", priority = -1 }
+# nursery lints warn but do not gate CI; they surface improvement candidates.
+nursery = { level = "warn", priority = -1 }
+
+[dependencies]
+tokio = { version = "1", features = [
+ "rt-multi-thread",
+ "net",
+ "io-util",
+ "fs",
+ "sync",
+ "time",
+ "macros",
+ "signal",
+] }
arti-client = { version = "0.40", features = [
"tokio",
@@ -36,31 +52,23 @@ arti-client = { version = "0.40", features = [
"onion-service-service",
] }
tor-hsservice = { version = "0.40" }
-# tor-cell: needed to construct the Connected message passed to StreamRequest::accept()
-tor-cell = { version = "0.40" }
-futures = "0.3"
+tor-cell = { version = "0.40" }
+futures = "0.3"
-# Onion-address encoding (used by the tor module to format HsId → ${base32}.onion)
-# sha3: provides SHA3-256 for the v3 address checksum
-# data-encoding: provides RFC 4648 base32 (no-padding, uppercase/lowercase)
-sha3 = "0.10"
+sha3 = "0.10"
data-encoding = "2"
+thiserror = "1"
+serde = { version = "1", features = ["derive"] }
+toml = "0.8"
+log = "0.4"
+crossterm = "0.27"
+chrono = { version = "0.4", features = ["clock"] }
+# OS error codes used in the accept-loop backoff to distinguish EMFILE/ENFILE
+# (resource exhaustion → log error) from transient errors (log debug).
+libc = "0.2"
-# Configuration — TOML parsing and typed deserialization
-serde = { version = "1", features = ["derive"] }
-toml = "0.8"
-
-# Logging — standard facade used by all modules
-log = "0.4"
-
-# Terminal — cross-platform raw mode, cursor control, key events
-crossterm = "0.27"
-
-# Timestamps — used in log formatting
-chrono = { version = "0.4", features = ["clock"] }
-
-# Signal handling — catches SIGINT / SIGTERM for graceful shutdown
-ctrlc = "3"
+[dev-dependencies]
+tempfile = "3"
[profile.release]
opt-level = 3
diff --git a/README.md b/README.md
index fba9d64..3f27949 100644
--- a/README.md
+++ b/README.md
@@ -54,11 +54,19 @@ Drop the binary next to your site files, run it once, and you get:
### 🌐 HTTP Server
- Built directly on `tokio::net::TcpListener` — no HTTP framework dependency
- Handles `GET` and `HEAD` requests; concurrent connections via per-task Tokio workers
-- Percent-decoded URL paths, query string & fragment stripping
-- **Path traversal protection** — every path verified as a descendant of the site root via `canonicalize`; escapes rejected with `403 Forbidden`
-- Configurable index file, optional HTML directory listings, and a built-in fallback page
+- **Buffered request reading** via `tokio::io::BufReader` — headers read line-by-line, not byte-by-byte
+- **File streaming** via `tokio::io::copy` — memory per connection is bounded by the socket buffer (~256 KB) regardless of file size
+- **30-second request timeout** (configurable via `request_timeout_secs`); slow or idle connections receive `408 Request Timeout`
+- **Semaphore-based connection limit** (configurable via `max_connections`, default 256) — excess connections queue at the OS backlog level rather than spawning unbounded tasks
+- Percent-decoded URL paths with correct multi-byte UTF-8 handling; null bytes (`%00`) are never decoded
+- Query string & fragment stripping before path resolution
+- **Path traversal protection** — every path verified as a descendant of the site root via `canonicalize` (called once at startup, not per request); escapes rejected with `403 Forbidden`
+- Configurable index file, optional HTML directory listing with fully HTML-escaped and URL-encoded filenames, and a built-in fallback page
- Automatic port selection if the configured port is busy (up to 10 attempts)
- Request header cap at 8 KiB; `Content-Type`, `Content-Length`, and `Connection: close` on every response
+- **Security headers on every response**: `X-Content-Type-Options`, `X-Frame-Options`, `Referrer-Policy: no-referrer`, `Permissions-Policy`; configurable `Content-Security-Policy` on HTML responses
+- **HEAD responses** include correct `Content-Length` but no body, as required by RFC 7231 §4.3.2
+- Accept loop uses **exponential backoff** on errors and distinguishes `EMFILE` (operator-level error) from transient errors (`ECONNRESET`, `ECONNABORTED`)
### 🧅 Tor Onion Service *(fully working)*
- Embedded via [Arti](https://gitlab.torproject.org/tpo/core/arti) — the official Rust Tor client — in-process, no external daemon
@@ -67,6 +75,9 @@ Drop the binary next to your site files, run it once, and you get:
- First run fetches ~2 MB of directory data (~30 s); subsequent starts reuse the cache and are up in seconds
- Onion address computed fully in-process using the v3 spec (SHA3-256 + base32)
- Each inbound Tor connection is bridged to the local HTTP listener via `tokio::io::copy_bidirectional`
+- **Port synchronised via `oneshot` channel** — the Tor subsystem always receives the actual bound port, eliminating a race condition that could cause silent connection failures
+- **`TorStatus` reflects mid-session failures** — if the onion service stream terminates unexpectedly, the dashboard transitions to `FAILED (reason)` and clears the displayed `.onion` address
+- Participates in **graceful shutdown** — the run loop watches the shutdown signal via `tokio::select!` and exits cleanly
- Can be disabled entirely with `[tor] enabled = false`
### 🖥️ Interactive Terminal Dashboard
@@ -81,18 +92,32 @@ Drop the binary next to your site files, run it once, and you get:
| `R` | Reload site file count & size without restart |
| `Q` | Graceful shutdown |
+- **Skip-on-idle rendering** — the terminal is only written when the rendered output changes, eliminating unnecessary writes on quiet servers
+- `TorStatus::Failed` displays a human-readable reason string (e.g. `FAILED (stream ended)`) rather than a bare error indicator
+- Keyboard input task failure is detected and reported; the process remains killable via Ctrl-C
+- **Terminal fully restored on all exit paths** — panic hook and error handler both call `console::cleanup()` before exiting, ensuring `LeaveAlternateScreen`, `cursor::Show`, and `disable_raw_mode` always run
- Configurable refresh rate (default 500 ms); headless mode available for `systemd` / piped deployments
### ⚙️ Configuration
- TOML file at `rusthost-data/settings.toml`, auto-generated with inline comments on first run
- Six sections: `[server]`, `[site]`, `[tor]`, `[logging]`, `[console]`, `[identity]`
+- **`#[serde(deny_unknown_fields)]`** on all structs — typos in key names are rejected at startup with a clear error
+- **Typed config fields** — `bind` is `IpAddr`, `level` is a `LogLevel` enum; invalid values are caught at deserialisation time
- Startup validation with clear, multi-error messages — nothing starts until config is clean
+- Config and data directory paths overridable via **`--config <path>`** and **`--data-dir <path>`** CLI flags
### 📝 Logging
- Custom `log::Log` implementation; dual output — append-mode log file + in-memory ring buffer (1 000 lines)
- Ring buffer feeds the dashboard log view with zero file I/O per render tick
+- **Dependency log filtering** — Arti and Tokio internals at `Info` and below are suppressed by default, keeping the log focused on application events (configurable via `filter_dependencies`)
+- Log file explicitly flushed on graceful shutdown
- Configurable level (`trace` → `error`) and optional full disable for minimal-overhead deployments
+### 🧪 Testing & CI
+- Unit tests for all security-critical functions: `percent_decode`, `resolve_path`, `validate`, `strip_timestamp`, `hsid_to_onion_address`
+- Integration tests (`tests/http_integration.rs`) covering all HTTP core flows via raw `TcpStream`
+- `cargo deny check` runs in CI, enforcing the SPDX license allowlist and advisory database; `audit.toml` consolidated into `deny.toml`
+
---
## Quick Start
@@ -134,37 +159,53 @@ rusthost-data/
The dashboard appears. Your site is live on `http://localhost:8080`. Tor bootstraps in the background — your `.onion` address appears in the **Endpoints** panel once ready (~30 s on first run).
+### CLI flags
+
+```
+rusthost [OPTIONS]
+
+Options:
+  --config <path>     Path to settings.toml (default: rusthost-data/settings.toml)
+  --data-dir <path>   Path to data directory (default: rusthost-data/ next to binary)
+ --version Print version and exit
+ --help Print this help and exit
+```
+
---
## Configuration Reference
```toml
[server]
-port = 8080
-bind = "127.0.0.1" # set "0.0.0.0" to expose on LAN (logs a warning)
-index_file = "index.html"
-directory_listing = false
-auto_port_fallback = true
+port = 8080
+bind = "127.0.0.1" # set "0.0.0.0" to expose on LAN (logs a warning)
+index_file = "index.html"
+directory_listing = false
+auto_port_fallback = true
+max_connections = 256 # semaphore cap on concurrent connections
+request_timeout_secs = 30 # seconds before idle connection receives 408
+content_security_policy = "default-src 'self'" # applied to HTML responses only
[site]
root = "rusthost-data/site"
[tor]
-enabled = true # set false to skip Tor entirely
+enabled = true # set false to skip Tor entirely
[logging]
-enabled = true
-level = "info" # trace | debug | info | warn | error
-path = "logs/rusthost.log"
+enabled = true
+level = "info" # trace | debug | info | warn | error
+path = "logs/rusthost.log"
+filter_dependencies = true # suppress Arti/Tokio noise at info and below
[console]
-interactive = true # false for systemd / piped deployments
-refresh_ms = 500 # minimum 100
-show_timestamps = false
+interactive = true # false for systemd / piped deployments
+refresh_ms = 500 # minimum 100
+show_timestamps = false
open_browser_on_start = false
[identity]
-name = "RustHost" # 1–32 chars, shown in dashboard header
+name = "RustHost" # 1–32 chars, shown in dashboard header
```
---
@@ -205,7 +246,9 @@ Unknown extensions fall back to `application/octet-stream`.
All subsystems share state through `Arc>`. Hot-path request and error counters use a separate `Arc` backed by atomics — the HTTP handler **never acquires a lock per request**.
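
A hedged sketch of that split, with illustrative names (not the crate's actual struct): the counters live behind their own `Arc` and are bumped with relaxed atomics, so the per-request path never touches the lock-guarded state.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hot-path counters, shared by cloning the Arc. Field names are assumptions.
#[derive(Default)]
pub struct Metrics {
    pub requests: AtomicU64,
    pub errors: AtomicU64,
}

/// Called once per request: a single lock-free atomic increment.
pub fn record_request(metrics: &Arc<Metrics>) {
    metrics.requests.fetch_add(1, Ordering::Relaxed);
}
```

`Ordering::Relaxed` is sufficient here because the counters are independent tallies read only for display — no other memory access is ordered against them.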
-Shutdown is coordinated via a `watch` channel: `[Q]`, `SIGINT`, or `SIGTERM` signals all subsystems simultaneously, waits 300 ms for in-flight connections, then exits. The Tor client is dropped naturally with the Tokio runtime — no explicit kill step needed.
+The HTTP server caps concurrent connections with a `tokio::sync::Semaphore` (default 256). The bound port is communicated to Tor via a `oneshot` channel before the accept loop begins, eliminating the startup race condition present in earlier versions.
+
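The real code uses `tokio::sync::oneshot`; the same handoff pattern can be sketched with std's synchronous channel, with threads standing in for tasks (values and names here are illustrative):

```rust
use std::sync::mpsc;
use std::thread;

/// The "server" side sends the port it actually bound before entering its
/// accept loop; the "tor" side blocks on that value instead of sleeping
/// and reading possibly-stale shared state.
fn demo_port_handoff() -> u16 {
    let (port_tx, port_rx) = mpsc::sync_channel::<u16>(1);

    let server = thread::spawn(move || {
        // In the real code this is listener.local_addr().port().
        let bound_port = 8080;
        port_tx.send(bound_port).expect("receiver alive");
        // ...accept loop would run here...
    });

    // Tor side: wait for the real port (the real code adds a 10 s timeout).
    let port = port_rx.recv().expect("server sent a port");
    server.join().expect("server thread");
    port
}
```

If the server fails to bind, dropping the sender closes the channel and the receiver gets an `Err` — a definite failure signal rather than a silently-lost race.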
+Shutdown is coordinated via a `watch` channel: `[Q]`, `SIGINT`, or `SIGTERM` signals all subsystems simultaneously. In-flight HTTP connections are tracked in a `JoinSet` and given up to 5 seconds to complete. The log file is explicitly flushed before the process exits.
---
@@ -213,11 +256,23 @@ Shutdown is coordinated via a `watch` channel: `[Q]`, `SIGINT`, or `SIGTERM` sig
| Concern | Mitigation |
|---------|-----------|
-| Path traversal | `std::fs::canonicalize` + descendant check; returns `403` on escape |
+| Path traversal (requests) | `std::fs::canonicalize` + descendant check per request; `403` on escape |
+| Path traversal (config) | `site.directory` and `logging.file` validated against `..`, absolute paths, and path separators at startup |
+| Directory listing XSS | Filenames HTML-entity-escaped in link text; percent-encoded in `href` attributes |
| Header overflow | 8 KiB hard cap; oversized requests rejected immediately |
+| Slow-loris DoS | 30-second request timeout; `408` sent on expiry |
+| Connection exhaustion | Semaphore cap (default 256); excess connections queue at OS level |
+| Memory exhaustion (large files) | Files streamed via `tokio::io::copy`; per-connection memory bounded by socket buffer |
| Bind exposure | Defaults to loopback (`127.0.0.1`); warns loudly on `0.0.0.0` |
+| ANSI/terminal injection | `instance_name` rejected at startup if it contains any control character (`char::is_control`) |
+| Security response headers | `X-Content-Type-Options`, `X-Frame-Options`, `Referrer-Policy: no-referrer`, `Permissions-Policy`, configurable `Content-Security-Policy` |
+| `.onion` URL leakage | `Referrer-Policy: no-referrer` prevents the `.onion` address from appearing in `Referer` headers sent to third-party resources |
+| Tor port race | Bound port delivered to Tor via `oneshot` channel before accept loop starts |
+| Silent Tor failure | `TorStatus` transitions to `Failed(reason)` and onion address is cleared when the service stream ends |
+| Percent-decode correctness | Multi-byte UTF-8 sequences decoded correctly; null bytes (`%00`) never decoded |
+| Config typos | `#[serde(deny_unknown_fields)]` on all structs |
| License compliance | `cargo-deny` enforces SPDX allowlist at CI time |
-| [RUSTSEC-2023-0071](https://rustsec.org/advisories/RUSTSEC-2023-0071) | Suppressed with rationale: the `rsa` crate is a transitive dep of `arti-client` used **only** for signature *verification* on Tor directory documents — the Marvin timing attack's threat model (decryption oracle) does not apply |
+| [RUSTSEC-2023-0071](https://rustsec.org/advisories/RUSTSEC-2023-0071) | Suppressed with rationale in `deny.toml`: the `rsa` crate is a transitive dep of `arti-client` used **only** for signature *verification* on Tor directory documents — the Marvin timing attack's threat model (decryption oracle) does not apply |
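
The descendant check in the first row reduces to a component-wise `Path::starts_with` on two canonicalised paths. A sketch with canonicalisation elided and illustrative paths:

```rust
use std::path::Path;

/// After both paths have been canonicalised (symlinks and `..` resolved),
/// a request is served only if the target is still inside the site root.
fn is_within_root(canonical_root: &Path, canonical_target: &Path) -> bool {
    canonical_target.starts_with(canonical_root)
}
```

Because `Path::starts_with` compares whole components, a sibling such as `/srv/site-backup/x` is correctly rejected under root `/srv/site`, even though a naive string-prefix check would admit it.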
---
diff --git a/deny.toml b/deny.toml
index daa8e54..f4bf2df 100644
--- a/deny.toml
+++ b/deny.toml
@@ -77,6 +77,24 @@ skip = [
{ name = "windows_x86_64_gnullvm" },
{ name = "windows_x86_64_msvc" },
{ name = "winnow" },
+ # 4.6 — Five additional duplicates present in Cargo.lock but previously
+ # absent from this list, causing `cargo deny check` to warn on every run.
+ #
+ # foldhash 0.1.x / 0.2.x — two generations pulled by hashbrown 0.15 vs 0.16
+ # (both via arti's dependency tree; different sub-crates pin different gens)
+ { name = "foldhash" },
+ # hashbrown 0.12 / 0.15 / 0.16 — pulled by indexmap 1.x (0.12), tokio/arti
+ # (0.15), and the latest arti-client dependency updates (0.16)
+ { name = "hashbrown" },
+ # indexmap 1.x / 2.x — arti-client 0.40 still carries several crates that
+ # depend on indexmap 1.x while the rest of the tree has migrated to 2.x
+ { name = "indexmap" },
+ # redox_syscall 0.5.x / 0.7.x — pulled by crossterm (0.5) and tokio (0.7)
+ # via their respective platform abstraction layers for Redox OS
+ { name = "redox_syscall" },
+ # schemars 0.9.x / 1.x — arti-client 0.40 uses schemars 0.9 for config
+ # schema generation; our direct deps have moved to 1.x
+ { name = "schemars" },
]
# ─── Advisories ───────────────────────────────────────────────────────────────
diff --git a/src/.DS_Store b/src/.DS_Store
deleted file mode 100644
index 4065707..0000000
Binary files a/src/.DS_Store and /dev/null differ
diff --git a/src/config/defaults.rs b/src/config/defaults.rs
index dda25ab..5d8aa50 100644
--- a/src/config/defaults.rs
+++ b/src/config/defaults.rs
@@ -27,6 +27,19 @@ auto_port_fallback = true
# Open the system default browser at http://localhost:<port> on startup.
open_browser_on_start = false
+# Maximum number of concurrent HTTP connections. Excess connections queue
+# at the OS TCP backlog level rather than spawning unbounded tasks.
+max_connections = 256
+
+# Content-Security-Policy value sent with every HTML response.
+# The default allows same-origin resources plus inline scripts and styles,
+# which is required for onclick handlers,
- Index of {url_path}
+ Index of {escaped_path}
@@ -232,29 +586,253 @@ fn build_directory_listing(dir: &Path, url_path: &str) -> String {
)
}
+/// HTML-entity-escape a string for safe insertion into HTML content or
+/// attribute values.
+fn html_escape(s: &str) -> String {
+ let mut out = String::with_capacity(s.len());
+ for ch in s.chars() {
+ match ch {
+ '&' => out.push_str("&amp;"),
+ '<' => out.push_str("&lt;"),
+ '>' => out.push_str("&gt;"),
+ '"' => out.push_str("&quot;"),
+ '\'' => out.push_str("&#39;"),
+ c => out.push(c),
+ }
+ }
+ out
+}
+
+/// Percent-encode a filename component for safe use in a URL path segment.
+///
+/// Encodes all bytes that are not unreserved URI characters (RFC 3986).
+fn percent_encode_path(s: &str) -> String {
+ let mut out = String::with_capacity(s.len());
+ for byte in s.bytes() {
+ match byte {
+ // Unreserved characters: ALPHA / DIGIT / "-" / "." / "_" / "~"
+ b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
+ // All matched bytes are ASCII; `char::from` is the
+ // clippy-pedantic-clean alternative to `byte as char`.
+ out.push(char::from(byte));
+ }
+ b => {
+ let _ = write!(out, "%{b:02X}");
+ }
+ }
+ }
+ out
+}
+
// ─── Percent decoding ────────────────────────────────────────────────────────
/// Decode percent-encoded characters in a URL path (e.g. `%20` → ` `).
-fn percent_decode(input: &str) -> String {
+///
+/// # Correctness (fix 4.5)
+///
+/// Accumulates consecutive percent-decoded bytes into a buffer and converts to
+/// UTF-8 via `String::from_utf8_lossy` only when a literal character (or
+/// end-of-input) breaks the run. This correctly handles multi-byte sequences
+/// split across adjacent `%XX` tokens (e.g. `%C3%A9` → `é`).
+///
+/// Null bytes (`%00`) are never decoded — they are passed through as the
+/// literal string `%00` to prevent null-byte path injection attacks.
+#[must_use]
+pub(crate) fn percent_decode(input: &str) -> String {
let mut output = String::with_capacity(input.len());
- let mut chars = input.chars();
-
- while let Some(c) = chars.next() {
- if c != '%' {
- output.push(c);
- continue;
- }
- // Decode the next two hex digits.
- let h1 = chars.next().and_then(|c| c.to_digit(16));
- let h2 = chars.next().and_then(|c| c.to_digit(16));
- if let (Some(a), Some(b)) = (h1, h2) {
- // Both digits are valid 0–15, so the combined value fits in u8.
- let byte = u8::try_from((a << 4) | b).unwrap_or(b'?');
- output.push(byte as char);
+ // Buffer for consecutive percent-decoded bytes that may form a multi-byte
+ // UTF-8 character together.
+ let mut byte_buf: Vec<u8> = Vec::new();
+
+ let src = input.as_bytes();
+ let mut i = 0;
+
+ while i < src.len() {
+ if src.get(i).copied() == Some(b'%') {
+ let h1 = src.get(i.saturating_add(1)).copied().and_then(hex_digit);
+ let h2 = src.get(i.saturating_add(2)).copied().and_then(hex_digit);
+ if let (Some(hi), Some(lo)) = (h1, h2) {
+ let byte = (hi << 4) | lo;
+ if byte == 0x00 {
+ // 4.5 — null byte: do not decode, emit literal %00.
+ flush_byte_buf(&mut byte_buf, &mut output);
+ output.push_str("%00");
+ } else {
+ byte_buf.push(byte);
+ }
+ i = i.saturating_add(3);
+ } else {
+ // Incomplete or invalid %XX — pass through literal `%`.
+ flush_byte_buf(&mut byte_buf, &mut output);
+ output.push('%');
+ i = i.saturating_add(1);
+ }
} else {
- output.push('%');
+ flush_byte_buf(&mut byte_buf, &mut output);
+ // Advance by one full UTF-8 character so we never split a scalar.
+ let ch = input
+ .get(i..)
+ .and_then(|s| s.chars().next())
+ .unwrap_or('\u{FFFD}');
+ output.push(ch);
+ i = i.saturating_add(ch.len_utf8());
}
}
-
+ // Flush any trailing percent-decoded bytes at end-of-input.
+ flush_byte_buf(&mut byte_buf, &mut output);
output
}
+
+/// Convert a single ASCII hex digit byte to its numeric value, or `None`.
+const fn hex_digit(b: u8) -> Option<u8> {
+ match b {
+ b'0'..=b'9' => Some(b.wrapping_sub(b'0')),
+ b'a'..=b'f' => Some(b.wrapping_sub(b'a').wrapping_add(10)),
+ b'A'..=b'F' => Some(b.wrapping_sub(b'A').wrapping_add(10)),
+ _ => None,
+ }
+}
+
+/// Interpret `buf` as UTF-8 (with lossy replacement for invalid sequences),
+/// append to `out`, then clear `buf`.
+fn flush_byte_buf(buf: &mut Vec<u8>, out: &mut String) {
+ if !buf.is_empty() {
+ out.push_str(&String::from_utf8_lossy(buf));
+ buf.clear();
+ }
+}
+
+// ─── Unit tests ───────────────────────────────────────────────────────────────
+
+#[cfg(test)]
+mod tests {
+ // `expect()` in test helpers is idiomatic and intentional — a failure here
+ // means the test environment itself is broken, not the code under test.
+ #![allow(clippy::expect_used)]
+
+ use std::path::Path;
+
+ use super::{percent_decode, resolve_path, Resolved};
+
+ // ── percent_decode ────────────────────────────────────────────────────────
+
+ #[test]
+ fn percent_decode_ascii_passthrough() {
+ assert_eq!(percent_decode("/index.html"), "/index.html");
+ }
+
+ #[test]
+ fn percent_decode_space() {
+ assert_eq!(percent_decode("/file%20name.html"), "/file name.html");
+ }
+
+ #[test]
+ fn percent_decode_multibyte_utf8() {
+ // %C3%A9 is the UTF-8 encoding of 'é' (U+00E9).
+ // Regression test for fix 4.5: the old implementation decoded each
+ // %XX pair as an independent u8→char cast, yielding "Ã©" instead of "é".
+ assert_eq!(percent_decode("/caf%C3%A9.html"), "/café.html");
+ }
+
+ #[test]
+ fn percent_decode_null_byte_not_decoded() {
+ // %00 must never be decoded to a null byte (path injection attack).
+ // The literal string "%00" must appear in the output unchanged.
+ let result = percent_decode("/foo%00/../secret");
+ assert!(
+ !result.contains('\x00'),
+ "null byte found in decoded output: {result:?}"
+ );
+ assert!(
+ result.contains("%00"),
+ "expected literal %00 in output, got: {result:?}"
+ );
+ }
+
+ #[test]
+ fn percent_decode_incomplete_percent_sequence() {
+ // "/foo%2" — the `%2` is not followed by a second hex digit, so the
+ // `%` is passed through literally and the `2` is re-processed.
+ assert_eq!(percent_decode("/foo%2"), "/foo%2");
+ }
+
+ #[test]
+ fn percent_decode_invalid_hex() {
+ // "%ZZ" contains non-hex digits after `%`; output must be unchanged.
+ assert_eq!(percent_decode("/foo%ZZ"), "/foo%ZZ");
+ }
+
+ // ── resolve_path ──────────────────────────────────────────────────────────
+ //
+ // All tests that exercise the file-system use a temporary directory so
+ // they are completely self-contained and leave no side effects.
+
+ /// Returns a canonical temp dir with the structure:
+ /// ```
+    /// <tmp>/
+    ///   root/
+    ///     index.html   ← served for happy-path tests
+    ///   secret.txt     ← outside root, for traversal tests
+ /// ```
+ fn make_test_tree() -> (tempfile::TempDir, std::path::PathBuf) {
+ let tmp = tempfile::tempdir().expect("tempdir");
+ let root = tmp.path().join("root");
+ std::fs::create_dir_all(&root).expect("create root");
+ std::fs::write(root.join("index.html"), b"hello").expect("write index");
+ std::fs::write(tmp.path().join("secret.txt"), b"secret").expect("write secret");
+ let canonical_root = root.canonicalize().expect("canonicalize root");
+ (tmp, canonical_root)
+ }
+
+ #[test]
+ fn resolve_path_happy_path() {
+ let (_tmp, root) = make_test_tree();
+ let result = resolve_path(&root, "/index.html", "index.html", false);
+ assert!(
+ matches!(result, Resolved::File(_)),
+ "expected Resolved::File, got {result:?}"
+ );
+ }
+
+ #[test]
+ fn resolve_path_directory_traversal() {
+ let (tmp, root) = make_test_tree();
+ // secret.txt lives one level above `root`, so "/../secret.txt" would
+ // escape the root if the traversal check were absent.
+ // canonicalize() resolves `/../secret.txt` → `/secret.txt`
+ // which is a real file, but it does NOT start_with `root` → Forbidden.
+ let _ = tmp; // keep alive so secret.txt exists for canonicalize
+ let result = resolve_path(&root, "/../secret.txt", "index.html", false);
+ assert_eq!(
+ result,
+ Resolved::Forbidden,
+ "expected Resolved::Forbidden for traversal attempt"
+ );
+ }
+
+ #[test]
+ fn resolve_path_encoded_slash_traversal() {
+ // After percent-decoding, "/..%2Fsecret.txt" becomes "/../secret.txt"
+ // which is what is passed to resolve_path — same traversal as above.
+ let (tmp, root) = make_test_tree();
+ let decoded = super::percent_decode("/..%2Fsecret.txt");
+ let _ = tmp;
+ let result = resolve_path(&root, &decoded, "index.html", false);
+ assert_eq!(result, Resolved::Forbidden);
+ }
+
+ #[test]
+ fn resolve_path_missing_file_returns_not_found() {
+ let (_tmp, root) = make_test_tree();
+ let result = resolve_path(&root, "/does_not_exist.txt", "index.html", false);
+ assert_eq!(result, Resolved::NotFound);
+ }
+
+ #[test]
+ fn resolve_path_missing_root_returns_fallback() {
+ // Passing a non-existent root means every canonicalize() call fails.
+ let missing_root = Path::new("/nonexistent/root/that/does/not/exist");
+ let result = resolve_path(missing_root, "/index.html", "index.html", false);
+ assert_eq!(result, Resolved::Fallback);
+ }
+}
diff --git a/src/server/mime.rs b/src/server/mime.rs
index 894c266..2a9c12b 100644
--- a/src/server/mime.rs
+++ b/src/server/mime.rs
@@ -10,11 +10,28 @@
///
/// # Examples
/// ```
+/// use rusthost::server::mime;
/// assert_eq!(mime::for_extension("html"), "text/html; charset=utf-8");
/// assert_eq!(mime::for_extension("xyz"), "application/octet-stream");
/// ```
+#[must_use]
pub fn for_extension(ext: &str) -> &'static str {
- match ext.to_ascii_lowercase().as_str() {
+ // 3.4 — Normalise to lowercase in a fixed stack buffer to avoid a heap
+ // allocation on every served file request. Extensions longer than 16 bytes
+ // are not in the table, so we short-circuit to the fallback immediately.
+ let bytes = ext.as_bytes();
+ let mut buf = [0u8; 16];
+ if bytes.len() > buf.len() {
+ return "application/octet-stream";
+ }
+ // Use zip to avoid any index that could theoretically panic under clippy's
+ // indexing_slicing lint; the length guard above already guarantees safety.
+ for (slot, &b) in buf.iter_mut().zip(bytes.iter()) {
+ *slot = b.to_ascii_lowercase();
+ }
+ // get() instead of a bare slice to satisfy clippy::indexing_slicing.
+ let lower = std::str::from_utf8(buf.get(..bytes.len()).unwrap_or_default()).unwrap_or("");
+ match lower {
// Text
"html" | "htm" => "text/html; charset=utf-8",
"css" => "text/css; charset=utf-8",
diff --git a/src/server/mod.rs b/src/server/mod.rs
index 46a3ebc..5400ef8 100644
--- a/src/server/mod.rs
+++ b/src/server/mod.rs
@@ -14,37 +14,65 @@ pub mod fallback;
pub mod handler;
pub mod mime;
-use std::{net::TcpListener as StdTcpListener, path::Path, path::PathBuf, sync::Arc};
+use std::{
+ net::{IpAddr, TcpListener as StdTcpListener},
+ path::{Path, PathBuf},
+ sync::Arc,
+ time::Duration,
+};
-use tokio::{net::TcpListener, sync::watch};
+use tokio::{
+ net::TcpListener,
+ sync::{oneshot, watch, Semaphore},
+ task::JoinSet,
+};
use crate::{
config::Config,
runtime::state::{SharedMetrics, SharedState},
- Result,
+ AppError, Result,
};
-// ─── Public API ─────────────────────────────────────────────────────────────
+// ─── Public API ──────────────────────────────────────────────────────────────
/// Start the HTTP server.
///
/// Binds the port (with optional fallback), updates `SharedState.actual_port`,
+/// sends the bound port through `port_tx` so Tor can start without a sleep,
/// then accepts connections until the shutdown watch fires.
+///
+/// ## Accept-loop observability (task 5.4)
+///
+/// Accept errors use exponential backoff (1 ms → 1 s) to prevent log storms
+/// under persistent failures such as `EMFILE`. Error severity is split:
+///
+/// - **`EMFILE` / `ENFILE`** (file-descriptor exhaustion) → logged at `error`;
+/// these require operator intervention.
+/// - **Transient errors** (`ECONNRESET`, `ECONNABORTED`, etc.) → logged at
+/// `debug`; they are expected under normal traffic and resolve automatically.
pub async fn run(
config: Arc,
state: SharedState,
metrics: SharedMetrics,
data_dir: PathBuf,
mut shutdown: watch::Receiver<bool>,
+ port_tx: oneshot::Sender<u16>,
) {
- let bind_addr = &config.server.bind;
- let base_port = config.server.port;
+ let bind_addr = config.server.bind;
+ // 4.2 — config.server.port is NonZeroU16; .get() produces the u16 value.
+ let base_port = config.server.port.get();
let fallback = config.server.auto_port_fallback;
+ // `u32 as usize`: usize ≥ 32 bits on every target Rust supports, so this
+ // conversion is lossless. The allow suppresses clippy::cast_possible_truncation.
+ #[allow(clippy::cast_possible_truncation)]
+ let max_conns = config.server.max_connections as usize;
let (listener, bound_port) = match bind_with_fallback(bind_addr, base_port, fallback) {
Ok(v) => v,
Err(e) => {
log::error!("Server failed to bind: {e}");
+ // port_tx is dropped here, which closes the channel; lifecycle
+ // will receive an Err from the oneshot receiver.
return;
}
};
@@ -59,30 +87,90 @@ pub async fn run(
s.server_running = true;
}
+ // Signal the bound port to lifecycle so Tor can start immediately.
+ let _ = port_tx.send(bound_port);
+
log::info!("HTTP server listening on {bind_addr}:{bound_port}");
let site_root = data_dir.join(&config.site.directory);
- let index_file = config.site.index_file.clone();
+ // 2.3 — canonicalize once here so resolve_path never calls canonicalize()
+ // per-request. If the root is missing or inaccessible, fail fast.
+ // 3.2 — Wrap in Arc so per-connection clones are O(1) refcount bumps.
+ let canonical_root: Arc = match site_root.canonicalize() {
+ Ok(p) => Arc::from(p.as_path()),
+ Err(e) => {
+ log::error!(
+ "Site root {} cannot be resolved: {e}. Check that [site] directory exists.",
+ site_root.display()
+ );
+ return;
+ }
+ };
+ // 3.2 — Arc: per-connection clone is an atomic refcount bump.
+ let index_file: Arc = Arc::from(config.site.index_file.as_str());
+ // 5.3 — Content-Security-Policy forwarded to every handler so it can be
+ // emitted on HTML responses without a global static.
+ let csp_header: Arc = Arc::from(config.server.content_security_policy.as_str());
let dir_list = config.site.enable_directory_listing;
+ let semaphore = Arc::new(Semaphore::new(max_conns));
+ // 2.10 — JoinSet tracks in-flight handler tasks so shutdown can drain them.
+ let mut join_set: JoinSet<()> = JoinSet::new();
+
+ // 5.4 — Exponential backoff on accept errors.
+ // Starts at 1 ms, doubles on each consecutive error, caps at 1 s.
+ // Reset to 1 ms after the next successful accept.
+ let mut backoff_ms: u64 = 1;
+
loop {
tokio::select! {
result = listener.accept() => {
match result {
Ok((stream, peer)) => {
+ // 5.4 — reset backoff after a successful accept.
+ backoff_ms = 1;
+
log::debug!("Connection from {peer}");
- let site = site_root.clone();
- let idx = index_file.clone();
+ let Ok(permit) = Arc::clone(&semaphore).acquire_owned().await else {
+ break; // semaphore closed — shutting down
+ };
+ if semaphore.available_permits() == 0 {
+ log::warn!(
+ "Connection limit ({max_conns}) reached; \
+ further connections will queue"
+ );
+ }
+ let site = Arc::clone(&canonical_root);
+ let idx = Arc::clone(&index_file);
let met = Arc::clone(&metrics);
- tokio::spawn(async move {
+ let csp = Arc::clone(&csp_header);
+ join_set.spawn(async move {
+ let _permit = permit;
if let Err(e) = handler::handle(
- stream, &site, &idx, dir_list, met
+ stream, site, idx, dir_list, met, csp,
).await {
log::debug!("Handler error: {e}");
}
});
}
- Err(e) => log::warn!("Accept error: {e}"),
+ Err(e) => {
+ // 5.4 — differentiate error severity.
+ if is_fd_exhaustion(&e) {
+ log::error!(
+ "Accept error — file-descriptor limit reached \
+ (EMFILE/ENFILE): {e}. Reduce max_connections or \
+ raise the OS ulimit."
+ );
+ } else {
+ log::debug!("Accept error (transient): {e}");
+ }
+
+ // 5.4 — exponential backoff: prevents log storms under
+ // persistent errors such as EMFILE (thousands of errors
+ // per second in a tight loop become at most one per 1 s).
+ tokio::time::sleep(Duration::from_millis(backoff_ms)).await;
+ backoff_ms = backoff_ms.saturating_mul(2).min(1_000);
+ }
}
}
@@ -93,14 +181,19 @@ pub async fn run(
}
state.write().await.server_running = false;
- log::info!("HTTP server stopped.");
+ log::info!("HTTP server stopped accepting; draining in-flight connections…");
+
+ // 2.10 — wait up to 5 seconds for in-flight handlers to complete.
+ let drain = async { while join_set.join_next().await.is_some() {} };
+ let _ = tokio::time::timeout(Duration::from_secs(5), drain).await;
+ log::info!("HTTP server drained.");
}
-// ─── Port binding ────────────────────────────────────────────────────────────
+// ─── Port binding ─────────────────────────────────────────────────────────────
/// Try to bind to `addr:port`. When `fallback` is true, increments the port
/// up to 10 times before giving up.
-fn bind_with_fallback(addr: &str, port: u16, fallback: bool) -> Result<(TcpListener, u16)> {
+fn bind_with_fallback(addr: IpAddr, port: u16, fallback: bool) -> Result<(TcpListener, u16)> {
let max_attempts: u16 = if fallback { 10 } else { 1 };
for attempt in 0..max_attempts {
@@ -116,41 +209,94 @@ fn bind_with_fallback(addr: &str, port: u16, fallback: bool) -> Result<(TcpListe
Err(e) if e.kind() == std::io::ErrorKind::AddrInUse && fallback => {
// Try the next port.
}
- Err(e) => {
- return Err(format!(
- "Port {try_port} is already in use. \
- Change [server].port in settings.toml or set \
- auto_port_fallback = true.\n OS error: {e}"
- )
- .into());
+ Err(source) => {
+ return Err(AppError::ServerBind {
+ port: try_port,
+ source,
+ });
}
}
}
- Err(format!(
- "Could not find a free port after {max_attempts} attempts \
- starting from {port}."
- )
- .into())
+ Err(AppError::ServerBind {
+ port,
+ source: std::io::Error::new(
+ std::io::ErrorKind::AddrInUse,
+ format!(
+ "Could not find a free port after {max_attempts} attempts \
+ starting from {port}. Change [server].port in settings.toml \
+ or set auto_port_fallback = true."
+ ),
+ ),
+ })
+}
+
+/// Return `true` when `e` represents file-descriptor exhaustion (`EMFILE` or
+/// `ENFILE`) on Unix platforms.
+///
+/// On non-Unix targets (Windows) where these error codes have no equivalent,
+/// always returns `false`.
+fn is_fd_exhaustion(e: &std::io::Error) -> bool {
+ #[cfg(unix)]
+ {
+ // EMFILE (24): too many open files for the process.
+ // ENFILE (23): too many open files system-wide.
+ // Both values are specified by POSIX and identical on Linux, macOS,
+ // FreeBSD, and other POSIX-conformant systems.
+ matches!(e.raw_os_error(), Some(libc::EMFILE | libc::ENFILE))
+ }
+ #[cfg(not(unix))]
+ {
+ let _ = e;
+ false
+ }
}
// ─── Site scanner ─────────────────────────────────────────────────────────────
-/// Count files and total bytes in the site directory (non-recursive).
-pub fn scan_site(site_root: &Path) -> (u32, u64) {
+/// Recursively count files and total bytes in `site_root` (BFS traversal).
+///
+/// Returns `Err` if any `read_dir` call fails so callers can log a warning
+/// instead of silently reporting zeros.
+///
+/// # Errors
+///
+/// Returns [`AppError::Io`] if any directory in the tree cannot be read.
+///
+/// **Must be called from a blocking context** (e.g.
+/// `tokio::task::spawn_blocking`) because `std::fs::read_dir` is a blocking
+/// syscall.
+#[must_use = "the file count and byte total are used to populate the dashboard"]
+pub fn scan_site(site_root: &Path) -> crate::Result<(u32, u64)> {
let mut count = 0u32;
let mut bytes = 0u64;
- if let Ok(entries) = std::fs::read_dir(site_root) {
+ let mut queue: std::collections::VecDeque<PathBuf> = std::collections::VecDeque::new();
+ queue.push_back(site_root.to_path_buf());
+
+ while let Some(dir) = queue.pop_front() {
+ let entries = std::fs::read_dir(&dir).map_err(|e| {
+ AppError::Io(std::io::Error::new(
+ e.kind(),
+ format!("Cannot read directory {}: {e}", dir.display()),
+ ))
+ })?;
+
for entry in entries.flatten() {
- if let Ok(meta) = entry.metadata() {
- if meta.is_file() {
+ match entry.metadata() {
+ Ok(m) if m.is_file() => {
count = count.saturating_add(1);
- bytes = bytes.saturating_add(meta.len());
+ bytes = bytes.saturating_add(m.len());
+ }
+ Ok(m) if m.is_dir() => {
+ queue.push_back(entry.path());
}
+ _ => {}
}
}
}
- (count, bytes)
+ Ok((count, bytes))
}
diff --git a/src/tor/mod.rs b/src/tor/mod.rs
index 948cfcd..ce13e81 100644
--- a/src/tor/mod.rs
+++ b/src/tor/mod.rs
@@ -37,7 +37,7 @@ use std::path::PathBuf;
use arti_client::config::TorClientConfigBuilder;
use arti_client::TorClient;
use futures::StreamExt;
-use tokio::net::TcpStream;
+use tokio::{net::TcpStream, sync::watch};
use tor_cell::relaycell::msg::Connected;
use tor_hsservice::{config::OnionServiceConfigBuilder, handle_rend_requests, HsId, StreamRequest};
@@ -50,23 +50,26 @@ use crate::runtime::state::{SharedState, TorStatus};
/// Spawns a Tokio task and returns immediately. Tor status and the onion
/// address are written into `state` as things progress, exactly as before.
///
-/// The signature is intentionally identical to the old subprocess version
-/// so `lifecycle.rs` requires zero changes.
-pub fn init(data_dir: PathBuf, bind_port: u16, state: SharedState) {
+/// `shutdown` is a watch channel whose `true` value triggers a clean exit
+/// from the stream-request loop (fix 2.10).
+pub fn init(
+ data_dir: PathBuf,
+ bind_port: u16,
+ state: SharedState,
+ shutdown: watch::Receiver<bool>,
+) {
tokio::spawn(async move {
- if let Err(e) = run(data_dir, bind_port, state.clone()).await {
+ if let Err(e) = run(data_dir, bind_port, state.clone(), shutdown).await {
log::error!("Tor: fatal error: {e}");
- set_status(&state, TorStatus::Failed(None)).await;
+ set_status(&state, TorStatus::Failed(e.to_string())).await;
}
});
}
-/// No-op on shutdown.
-///
-/// The `TorClient` is owned by the Tokio task spawned in `init()` and is
-/// dropped — closing all Tor circuits — when that task exits as part of the
-/// normal Tokio runtime shutdown. Nothing needs to be done explicitly here.
-pub const fn kill() {}
+// `kill()` has been removed (fix 2.10): the `TorClient` is owned by the task
+// spawned in `init()` and is dropped when that task exits, which closes all
+// Tor circuits cleanly. Graceful shutdown is now signalled through the
+// `shutdown` watch channel passed to `init()`.
// ─── Core async logic ─────────────────────────────────────────────────────────
@@ -74,6 +77,7 @@ async fn run(
data_dir: PathBuf,
bind_port: u16,
state: SharedState,
+ mut shutdown: watch::Receiver<bool>,
) -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
set_status(&state, TorStatus::Starting).await;
@@ -158,17 +162,50 @@ async fn run(
// each other. Dropping the task naturally closes the Tor circuit.
let mut stream_requests = handle_rend_requests(rend_requests);
- while let Some(stream_req) = stream_requests.next().await {
- let local_addr = format!("127.0.0.1:{bind_port}");
- tokio::spawn(async move {
- if let Err(e) = proxy_stream(stream_req, &local_addr).await {
- // Downgraded to debug — normal on abrupt disconnects.
- log::debug!("Tor: stream closed: {e}");
+ let semaphore = std::sync::Arc::new(tokio::sync::Semaphore::new(256));
+
+ // 2.10 — use select! so a shutdown signal can break the accept loop cleanly,
+ // instead of blocking indefinitely in stream_requests.next().
+ loop {
+ tokio::select! {
+ next = stream_requests.next() => {
+ if let Some(stream_req) = next {
+ let local_addr = format!("127.0.0.1:{bind_port}");
+ let Ok(permit) = std::sync::Arc::clone(&semaphore).acquire_owned().await else {
+ break; // semaphore closed
+ };
+ tokio::spawn(async move {
+ let _permit = permit;
+ if let Err(e) = proxy_stream(stream_req, &local_addr).await {
+ // Downgraded to debug — normal on abrupt disconnects.
+ log::debug!("Tor: stream closed: {e}");
+ }
+ });
+ } else {
+ // The onion service stream ended unexpectedly (Tor network
+ // disruption, Arti internal error, resource exhaustion).
+ // Flip the dashboard to Failed so the operator sees a clear
+ // signal rather than a permanently green READY badge.
+ log::warn!(
+ "Tor: stream_requests stream ended — onion service is no longer active"
+ );
+ // 2.9 — use Failed(String) with a human-readable reason
+ set_status(&state, TorStatus::Failed("stream ended".into())).await;
+ state.write().await.onion_address = None;
+ return Ok(());
+ }
+ }
+ _ = shutdown.changed() => {
+ if *shutdown.borrow() {
+ log::info!("Tor: shutdown signal received — stopping stream loop");
+ break;
+ }
}
- });
+ }
}
- log::warn!("Tor: stream_requests stream ended — onion service is no longer active");
+ // Clean shutdown: clear the displayed onion address.
+ state.write().await.onion_address = None;
Ok(())
}
@@ -201,9 +238,14 @@ async fn proxy_stream(
/// Encode an `HsId` (ed25519 public key) as a v3 `.onion` domain name.
///
-/// `arti-client 0.40` exposes `HsId` via `DisplayRedacted` (from the `safelog`
-/// crate) rather than `std::fmt::Display`, so we cannot use `format!("{}", …)`
-/// directly. We implement the encoding ourselves using the spec:
+/// Delegates to [`onion_address_from_pubkey`] which is separately unit-tested.
+fn hsid_to_onion_address(hsid: HsId) -> String {
+ onion_address_from_pubkey(hsid.as_ref())
+}
+
+/// Encode a raw 32-byte ed25519 public key as a v3 `.onion` domain name.
+///
+/// Implements the encoding defined in the Tor Rendezvous Specification:
///
/// ```text
/// onion_address = base32(PUBKEY | CHECKSUM | VERSION) + ".onion"
@@ -211,11 +253,15 @@ async fn proxy_stream(
/// VERSION = 0x03
/// ```
///
-/// `HsId: AsRef<[u8; 32]>` is stable across arti 0.40+.
-fn hsid_to_onion_address(hsid: HsId) -> String {
+/// The output is always exactly 62 characters: 56 lowercase base32 characters
+/// followed by `".onion"`.
+///
+/// Separated from [`hsid_to_onion_address`] so that tests can supply an
+/// arbitrary 32-byte key without constructing an `HsId`.
+#[must_use]
+pub(crate) fn onion_address_from_pubkey(pubkey: &[u8; 32]) -> String {
use sha3::{Digest, Sha3_256};
- let pubkey: &[u8; 32] = hsid.as_ref();
let version: u8 = 3;
// CHECKSUM = SHA3-256(".onion checksum" || PUBKEY || VERSION) truncated to 2 bytes
@@ -230,14 +276,13 @@ fn hsid_to_onion_address(hsid: HsId) -> String {
address_bytes[..32].copy_from_slice(pubkey);
// Consume the first two checksum bytes via an iterator — clippy cannot
// prove at compile time that a GenericArray has >= 2 elements, so direct
- // indexing (hash[0], hash[1]) triggers `indexing_slicing`. SHA3-256
- // always produces 32 bytes, so next() will never return None here.
+ // indexing triggers `indexing_slicing`. SHA3-256 always produces 32 bytes.
let mut hash_iter = hash.iter().copied();
address_bytes[32] = hash_iter.next().unwrap_or(0);
address_bytes[33] = hash_iter.next().unwrap_or(0);
address_bytes[34] = version;
- // RFC 4648 base32, no padding, lowercase → 56 characters
+ // RFC 4648 base32, no padding, lowercase → 56 characters
let encoded = data_encoding::BASE32_NOPAD
.encode(&address_bytes)
.to_ascii_lowercase();
@@ -246,6 +291,9 @@ fn hsid_to_onion_address(hsid: HsId) -> String {
}
// ─── State helpers ────────────────────────────────────────────────────────────
+//
+// These must appear BEFORE the #[cfg(test)] module; items after a test module
+// trigger the `clippy::items_after_test_module` lint.
async fn set_status(state: &SharedState, status: TorStatus) {
state.write().await.tor_status = status;
@@ -256,3 +304,83 @@ async fn set_onion(state: &SharedState, addr: String) {
s.tor_status = TorStatus::Ready;
s.onion_address = Some(addr);
}
+
+// ─── Unit tests ───────────────────────────────────────────────────────────────
+
+#[cfg(test)]
+mod tests {
+ use super::onion_address_from_pubkey;
+
+ /// Compute the expected onion address for a given 32-byte key using the
+ /// same algorithm as `onion_address_from_pubkey`, acting as an independent
+ /// reference implementation to cross-check the production code.
+ fn reference_onion(pubkey: &[u8; 32]) -> String {
+ use data_encoding::BASE32_NOPAD;
+ use sha3::{Digest, Sha3_256};
+
+ let version: u8 = 3;
+ let mut hasher = Sha3_256::new();
+ hasher.update(b".onion checksum");
+ hasher.update(pubkey);
+ hasher.update([version]);
+ let hash = hasher.finalize();
+
+ let mut bytes = [0u8; 35];
+ bytes[..32].copy_from_slice(pubkey);
+ // Use iterator instead of direct indexing to avoid clippy::indexing_slicing.
+ // SHA3-256 always produces 32 bytes, so next() will never return None.
+ let mut it = hash.iter().copied();
+ bytes[32] = it.next().unwrap_or(0);
+ bytes[33] = it.next().unwrap_or(0);
+ bytes[34] = version;
+
+ format!("{}.onion", BASE32_NOPAD.encode(&bytes).to_ascii_lowercase())
+ }
+
+ #[test]
+ fn hsid_to_onion_address_all_zeros_vector() {
+ // Fixed 32-byte test vector: all zeros.
+ // The expected value is derived from the reference implementation above.
+ let pubkey = [0u8; 32];
+ let expected = reference_onion(&pubkey);
+ let actual = onion_address_from_pubkey(&pubkey);
+ assert_eq!(actual, expected);
+ }
+
+ #[test]
+ fn hsid_to_onion_address_format_is_correct() {
+ let pubkey = [0u8; 32];
+ let addr = onion_address_from_pubkey(&pubkey);
+ // A v3 onion address is always 56 base32 chars + ".onion" = 62 chars.
+ assert_eq!(addr.len(), 62, "unexpected length: {addr:?}");
+ // Use strip_suffix to avoid clippy::case_sensitive_file_extension_comparison.
+ assert!(
+ addr.strip_suffix(".onion").is_some(),
+ "must end with .onion: {addr:?}"
+ );
+ let host = addr.strip_suffix(".onion").unwrap_or(&addr);
+ assert!(
+ host.chars()
+ .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit()),
+ "host contains non-base32 characters: {host:?}"
+ );
+ }
+
+ #[test]
+ fn hsid_to_onion_address_is_deterministic() {
+ // Calling the function twice with the same key must produce the same
+ // output — the address must be derivable from the public key alone.
+ let pubkey = [0x42u8; 32];
+ assert_eq!(
+ onion_address_from_pubkey(&pubkey),
+ onion_address_from_pubkey(&pubkey)
+ );
+ }
+
+ #[test]
+ fn hsid_to_onion_address_different_keys_produce_different_addresses() {
+ let a = onion_address_from_pubkey(&[0u8; 32]);
+ let b = onion_address_from_pubkey(&[1u8; 32]);
+ assert_ne!(a, b, "different keys must produce different addresses");
+ }
+}
diff --git a/tests/http_integration.rs b/tests/http_integration.rs
new file mode 100644
index 0000000..1e1c4fd
--- /dev/null
+++ b/tests/http_integration.rs
@@ -0,0 +1,406 @@
+//! # HTTP Server Integration Tests (task 5.2)
+//!
+//! Each test spins up an isolated [`rusthost::server::run`] instance, connects
+//! to it via [`tokio::net::TcpStream`], sends raw HTTP/1.1, and inspects the
+//! raw response bytes.
+//!
+//! ## Port allocation
+//!
+//! Each test calls [`free_port()`] which binds a `StdTcpListener` on
+//! `127.0.0.1:0`, reads the OS-assigned port, and immediately closes the
+//! listener. That port is then passed to the test config with
+//! `auto_port_fallback = false`, so the server binds the same (now-free) port.
+//! The TOCTOU window between release and server bind is acceptable on the
+//! loopback interface and eliminates the port-collision risk that would arise
+//! from having every test start from port 8080.
+
+use std::{net::SocketAddr, path::Path, sync::Arc, time::Duration};
+
+use tokio::{
+ io::{AsyncReadExt, AsyncWriteExt},
+ net::TcpStream,
+ sync::{watch, RwLock},
+};
+
+use rusthost::{
+ config::Config,
+ runtime::state::{AppState, Metrics},
+};
+
+// ─── Port helper ──────────────────────────────────────────────────────────────
+
+/// Ask the OS for a free port by binding on `0`, then release it.
+///
+/// Returns the port number for immediate use as the test server's bind port.
+/// `auto_port_fallback` is set to `false` in the test config so the server
+/// always binds exactly this port rather than searching a range.
+fn free_port() -> Result<u16, std::io::Error> {
+ use std::net::TcpListener;
+ let listener = TcpListener::bind("127.0.0.1:0")?;
+ Ok(listener.local_addr()?.port())
+ // listener is dropped here, releasing the port
+}
+
+// ─── Test harness ─────────────────────────────────────────────────────────────
+
+/// A live server instance scoped to one test.
+struct TestServer {
+ addr: SocketAddr,
+ shutdown_tx: watch::Sender<bool>,
+ handle: Option<tokio::task::JoinHandle<()>>,
+}
+
+impl TestServer {
+ /// Spin up a server bound to the port returned by [`free_port()`].
+ ///
+ /// `site_root` must already contain the files the test expects to serve.
+ async fn start(site_root: &Path) -> Result<Self, Box<dyn std::error::Error>> {
+ let port = free_port()?;
+ let config = Arc::new(build_test_config(site_root, port));
+ let state = Arc::new(RwLock::new(AppState::new()));
+ let metrics = Arc::new(Metrics::new());
+ let (shutdown_tx, shutdown_rx) = watch::channel(false);
+ let (port_tx, port_rx) = tokio::sync::oneshot::channel::<u16>();
+
+ // The server joins data_dir + config.site.directory to find files.
+ // `site_root` ends in `/site`; `data_dir` must therefore be its parent.
+ let data_dir = site_root.parent().unwrap_or(site_root).to_path_buf();
+
+ let handle = {
+ let cfg = Arc::clone(&config);
+ let st = Arc::clone(&state);
+ let met = Arc::clone(&metrics);
+ let shut = shutdown_rx;
+ tokio::spawn(async move {
+ rusthost::server::run(cfg, st, met, data_dir, shut, port_tx).await;
+ })
+ };
+
+ // Wait for the server to confirm its bound port (5 s guard).
+ let bound_port = tokio::time::timeout(Duration::from_secs(5), port_rx).await??;
+
+ let addr: SocketAddr = format!("127.0.0.1:{bound_port}").parse()?;
+
+ Ok(Self {
+ addr,
+ shutdown_tx,
+ handle: Some(handle),
+ })
+ }
+
+ /// Send raw `request` bytes and return the complete response as a `String`.
+ ///
+ /// A 5-second read deadline prevents a misbehaving server from hanging the
+ /// test suite indefinitely.
+ async fn send(&self, request: &[u8]) -> Result<String, Box<dyn std::error::Error>> {
+ let mut stream = TcpStream::connect(self.addr).await?;
+ stream.write_all(request).await?;
+
+ let mut response = Vec::new();
+ tokio::time::timeout(Duration::from_secs(5), async {
+ let mut buf = [0u8; 4096];
+ loop {
+ let n = stream.read(&mut buf).await?;
+ if n == 0 {
+ break;
+ }
+ let slice = buf
+ .get(..n)
+ .ok_or_else(|| std::io::Error::other("read returned out-of-bounds length"))?;
+ response.extend_from_slice(slice);
+ }
+ Ok::<_, std::io::Error>(())
+ })
+ .await??;
+
+ Ok(String::from_utf8_lossy(&response).into_owned())
+ }
+
+ /// Gracefully shut the server down and await task exit.
+ async fn stop(mut self) {
+ let _ = self.shutdown_tx.send(true);
+ if let Some(handle) = self.handle.take() {
+ tokio::time::timeout(Duration::from_secs(5), handle)
+ .await
+ .ok();
+ }
+ }
+}
+
+impl Drop for TestServer {
+ fn drop(&mut self) {
+ // Best-effort signal if the test panics before calling `stop()`.
+ let _ = self.shutdown_tx.send(true);
+ }
+}
+
+// ─── Config + fixture helpers ─────────────────────────────────────────────────
+
+/// Build a minimal [`Config`] whose site directory matches `site_root`.
+fn build_test_config(site_root: &Path, port: u16) -> Config {
+ use std::num::NonZeroU16;
+
+ let mut config = Config::default();
+ config.server.port = NonZeroU16::new(port).unwrap_or(NonZeroU16::MIN);
+ // auto_port_fallback = false: the server must bind exactly `port`.
+ config.server.auto_port_fallback = false;
+ config.server.open_browser_on_start = false;
+ config.server.max_connections = 16;
+ // Use the directory basename; server joins data_dir + this name.
+ config.site.directory = String::from(
+ site_root
+ .file_name()
+ .and_then(|n| n.to_str())
+ .unwrap_or("site"),
+ );
+ config.site.index_file = "index.html".into();
+ config.site.enable_directory_listing = false;
+ config.tor.enabled = false;
+ config.console.interactive = false;
+ config
+}
+
+/// Create a temporary `/site/` tree.
+///
+/// Returns `(TempDir, site_path)`. The caller must keep `TempDir` alive for
+/// the duration of the test.
+fn make_site(
+ files: &[(&str, &[u8])],
+) -> Result<(tempfile::TempDir, std::path::PathBuf), Box<dyn std::error::Error>> {
+ let tmp = tempfile::tempdir()?;
+ let site = tmp.path().join("site");
+ std::fs::create_dir_all(&site)?;
+ for (name, content) in files {
+ std::fs::write(site.join(name), content)?;
+ }
+ Ok((tmp, site))
+}
+
+// ─── Response assertion helpers ───────────────────────────────────────────────
+
+/// Extract the numeric HTTP status code from the response status line.
+fn status_code(response: &str) -> Option<u16> {
+ response.split_whitespace().nth(1)?.parse().ok()
+}
+
+/// `true` when there are no bytes after the `\r\n\r\n` header terminator.
+fn body_is_empty(response: &str) -> bool {
+ response
+ .find("\r\n\r\n")
+ .is_none_or(|sep| response.len() == sep.saturating_add(4))
+}
+
+/// `true` when the named header appears in the response (case-insensitive).
+fn has_header(response: &str, name: &str) -> bool {
+ let name_lc = name.to_ascii_lowercase();
+ response
+ .lines()
+ .skip(1) // skip status line
+ .any(|l| l.to_ascii_lowercase().starts_with(&name_lc))
+}
+
+// ─── Core HTTP flow tests (task 5.2) ─────────────────────────────────────────
+
+#[tokio::test]
+async fn get_index_html_returns_200() -> Result<(), Box<dyn std::error::Error>> {
+ let (tmp, site) = make_site(&[("index.html", b"hello")])?;
+ let server = TestServer::start(&site).await?;
+
+ let response = server
+ .send(b"GET /index.html HTTP/1.1\r\nHost: localhost\r\n\r\n")
+ .await?;
+ server.stop().await;
+ drop(tmp);
+
+ assert_eq!(
+ status_code(&response),
+ Some(200),
+ "GET /index.html must return 200:\n{response}"
+ );
+ Ok(())
+}
+
+#[tokio::test]
+async fn head_request_returns_headers_no_body() -> Result<(), Box<dyn std::error::Error>> {
+ let (tmp, site) = make_site(&[("index.html", b"hello")])?;
+ let server = TestServer::start(&site).await?;
+
+ let response = server
+ .send(b"HEAD /index.html HTTP/1.1\r\nHost: localhost\r\n\r\n")
+ .await?;
+ server.stop().await;
+ drop(tmp);
+
+ assert_eq!(
+ status_code(&response),
+ Some(200),
+ "HEAD must return 200:\n{response}"
+ );
+ assert!(
+ has_header(&response, "content-length"),
+ "HEAD must include Content-Length:\n{response}"
+ );
+ assert!(
+ body_is_empty(&response),
+ "HEAD must not include a body:\n{response}"
+ );
+ Ok(())
+}
+
+#[tokio::test]
+async fn get_root_with_index_file_serves_200() -> Result<(), Box<dyn std::error::Error>> {
+ let (tmp, site) = make_site(&[("index.html", b"root")])?;
+ let server = TestServer::start(&site).await?;
+
+ let response = server
+ .send(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
+ .await?;
+ server.stop().await;
+ drop(tmp);
+
+ assert_eq!(
+ status_code(&response),
+ Some(200),
+ "GET / must serve index.html (200):\n{response}"
+ );
+ Ok(())
+}
+
+#[tokio::test]
+async fn directory_traversal_returns_403() -> Result<(), Box<dyn std::error::Error>> {
+ let (tmp, site) = make_site(&[("index.html", b"safe")])?;
+ let server = TestServer::start(&site).await?;
+
+ let response = server
+ .send(b"GET /../etc/passwd HTTP/1.1\r\nHost: localhost\r\n\r\n")
+ .await?;
+ server.stop().await;
+ drop(tmp);
+
+ assert_eq!(
+ status_code(&response),
+ Some(403),
+ "traversal must return 403:\n{response}"
+ );
+ Ok(())
+}
+
+#[tokio::test]
+async fn oversized_request_header_returns_400() -> Result<(), Box<dyn std::error::Error>> {
+ let (tmp, site) = make_site(&[("index.html", b"ok")])?;
+ let server = TestServer::start(&site).await?;
+
+ // Build headers that exceed the 8 KiB limit enforced by `read_request`.
+ let padding = format!("X-Padding: {}\r\n", "A".repeat(8_300));
+ let request = format!("GET / HTTP/1.1\r\nHost: localhost\r\n{padding}\r\n");
+
+ let response = server.send(request.as_bytes()).await?;
+ server.stop().await;
+ drop(tmp);
+
+ assert_eq!(
+ status_code(&response),
+ Some(400),
+ "oversized headers must return 400:\n{response}"
+ );
+ Ok(())
+}
+
+#[tokio::test]
+async fn get_nonexistent_file_returns_404() -> Result<(), Box<dyn std::error::Error>> {
+ let (tmp, site) = make_site(&[("index.html", b"ok")])?;
+ let server = TestServer::start(&site).await?;
+
+ let response = server
+ .send(b"GET /nonexistent.txt HTTP/1.1\r\nHost: localhost\r\n\r\n")
+ .await?;
+ server.stop().await;
+ drop(tmp);
+
+ assert_eq!(
+ status_code(&response),
+ Some(404),
+ "missing file must return 404:\n{response}"
+ );
+ Ok(())
+}
+
+#[tokio::test]
+async fn post_request_returns_400() -> Result<(), Box<dyn std::error::Error>> {
+ let (tmp, site) = make_site(&[("index.html", b"ok")])?;
+ let server = TestServer::start(&site).await?;
+
+ let response = server
+ .send(b"POST /index.html HTTP/1.1\r\nHost: localhost\r\nContent-Length: 0\r\n\r\n")
+ .await?;
+ server.stop().await;
+ drop(tmp);
+
+ assert_eq!(
+ status_code(&response),
+ Some(400),
+ "POST must be rejected with 400:\n{response}"
+ );
+ Ok(())
+}
+
+// ─── Security header tests (task 5.3 — integration verification) ─────────────
+
+#[tokio::test]
+async fn all_security_headers_present_on_html_response() -> Result<(), Box<dyn std::error::Error>> {
+ let (tmp, site) = make_site(&[("index.html", b"ok")])?;
+ let server = TestServer::start(&site).await?;
+
+ let response = server
+ .send(b"GET /index.html HTTP/1.1\r\nHost: localhost\r\n\r\n")
+ .await?;
+ server.stop().await;
+ drop(tmp);
+
+ // All five headers must be present on an HTML response.
+ for header in &[
+ "x-content-type-options",
+ "x-frame-options",
+ "referrer-policy",
+ "permissions-policy",
+ "content-security-policy",
+ ] {
+ assert!(
+ has_header(&response, header),
+ "missing security header '{header}' on HTML:\n{response}"
+ );
+ }
+ Ok(())
+}
+
+#[tokio::test]
+async fn csp_absent_and_base_headers_present_on_non_html_response(
+) -> Result<(), Box<dyn std::error::Error>> {
+ let (tmp, site) = make_site(&[("style.css", b"body{color:red}")])?;
+ let server = TestServer::start(&site).await?;
+
+ let response = server
+ .send(b"GET /style.css HTTP/1.1\r\nHost: localhost\r\n\r\n")
+ .await?;
+ server.stop().await;
+ drop(tmp);
+
+ // The four universal headers must be present on all response types.
+ for header in &[
+ "x-content-type-options",
+ "x-frame-options",
+ "referrer-policy",
+ "permissions-policy",
+ ] {
+ assert!(
+ has_header(&response, header),
+ "missing security header '{header}' on CSS:\n{response}"
+ );
+ }
+ // Content-Security-Policy must NOT appear on non-HTML responses.
+ assert!(
+ !has_header(&response, "content-security-policy"),
+ "CSP must not appear on CSS responses:\n{response}"
+ );
+ Ok(())
+}