feat(linux-port): bootstrap MVP boxctl, config loader, supervisor, firewall skeleton, and systemd units by JadenJSJ · Pull Request #1 · JSJ-Experiments/box4linux

JadenJSJ · 2026-03-05T14:21:27Z

Summary by CodeRabbit

Release Notes

New Features
- Added boxctl command-line tool for managing Box services and firewall
- Firewall support with DNS hijacking, routing modes, and Tailscale coexistence
- Systemd integration for Box service and firewall management
- Configuration system with TOML file support
- Service supervision for proxy cores (Mihomo, Sing-box)

Documentation
- Project setup and quick start guide
- Firewall configuration and routing policies
- Deployment and risk assessment documentation

Tests
- Comprehensive integration test suite for service and firewall operations

…rewall skeleton, and systemd units

coderabbitai · 2026-03-05T14:21:45Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 893effa0-a95e-4e6d-b589-6581e5af2715

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

…f convergence, and side-effect-free service status

JadenJSJ · 2026-03-05T15:10:19Z

@coderabbitai full review

coderabbitai · 2026-03-05T15:12:44Z

✅ Actions performed

Full review triggered.

coderabbitai

Actionable comments posted: 18

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/linux-port/06-component-network-policy-watchers.md`:
- Around line 50-53: The documentation's dns_coexist_mode values are out of sync
with runtime validation: replace the advertised `proxy_all` option with the
actual `strict_box` option and update the descriptions so `preserve_tailnet`
(default) still never hijacks the Tailscale resolver and routes
`*.ts.net`/MagicDNS via the system resolver path, while `strict_box` is the
explicit opt-in that allows full DNS hijack; ensure the doc text for
dns_coexist_mode lists the exact accepted token names
(`preserve_tailnet|strict_box`) and describes behaviors to match the runtime
validation and implementation.

In `@docs/linux-port/09-phase2-firewall-supervisor-tests.md`:
- Around line 3-55: The file triggers markdownlint MD022 because several section
headings (e.g., "Scope Implemented", "Firewall Apply Order", "Coexistence
Modes", "Route Pref Convergence", "Overlay Contract", "Known Gaps") are not
followed by a blank line; fix by inserting a single blank line immediately after
each of those heading lines so each heading is followed by an empty line (ensure
the headings exactly as written in the diff are updated).

In `@lib/common.sh`:
- Around line 69-71: The raw message variable is interpolated into log_line and
can contain quotes/newlines that break the structured log; add an escaping step
(e.g., a helper like escape_message) that transforms message by escaping
backslashes and double quotes and converting newlines/tabs to literal \n/\t
sequences, then use the escaped value when building log_line (so update the
log_line construction to use the escaped message instead of the raw message).
The timestamp_utc call remains unchanged; ensure the helper is invoked before
log_line is set.

In `@lib/config.sh`:
- Around line 63-71: strip_inline_comment currently checks the first character
before trimming leading whitespace, so values like key =   "a#b" get treated as
unquoted and truncated; change the logic in strip_inline_comment to detect the
first non-whitespace character (i.e., compute the first non-space char of local
value and use that for quote detection) and if that char is a double or single
quote, return the original value (preserving leading whitespace), otherwise
proceed to strip inline comments with value="${value%%#*}" as before; refer to
the strip_inline_comment function and the local value/first_char variables when
making this change.
- Around line 127-141: In config_detect_file(), don't allow an explicitly set
BOX_CONFIG_FILE to be ignored by falling back to dev_cfg: first check if
BOX_CONFIG_FILE is non-empty and if the referenced file exists—if it does, set
BOX_CONFIG_SOURCE (e.g., "env" or "explicit") and return 0; if BOX_CONFIG_FILE
is non-empty but the file does not exist, fail fast (log an error and return
non-zero) instead of continuing to check dev_cfg; otherwise continue with the
existing logic that checks system_cfg and dev_cfg. Ensure you reference the
BOX_CONFIG_FILE variable and the config_detect_file function when making the
change so the intent is clear.
- Around line 199-215: Validate that BOX_ROUTE_TABLE, BOX_TAILSCALE_ROUTE_TABLE,
and BOX_ROUTE_PREF are numeric (decimal integer) before doing comparisons: add
checks (e.g., a regex like ^[0-9]+$) for BOX_ROUTE_TABLE and
BOX_TAILSCALE_ROUTE_TABLE and for BOX_ROUTE_PREF, and log and return E_CONFIG
with a clear error code/message (reuse or add E_CONFIG_ROUTE_TABLE and
E_CONFIG_ROUTE_PREF) if any value is non-numeric or out of allowed bounds;
perform these validations before the existing comparisons in the same validation
block so the comparisons only run on confirmed numeric values.
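
The numeric pre-check in the last item could look roughly like the sketch below. The function names `is_decimal_int` and `validate_route_numbers` are hypothetical, the fallback return code is an assumption, and the real validation block may report errors through the project's `log` helper instead of `printf`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: true only for a non-empty string of decimal digits.
is_decimal_int() {
  [[ "$1" =~ ^[0-9]+$ ]]
}

# Sketch: reject non-numeric route settings before any numeric comparison runs.
# E_CONFIG is taken from the environment; the fallback of 1 is an assumption.
validate_route_numbers() {
  local name
  for name in BOX_ROUTE_TABLE BOX_TAILSCALE_ROUTE_TABLE BOX_ROUTE_PREF; do
    if ! is_decimal_int "${!name:-}"; then
      printf 'config error: %s must be a decimal integer, got "%s"\n' \
        "${name}" "${!name:-}" >&2
      return "${E_CONFIG:-1}"
    fi
  done
  return 0
}
```

Running this before the existing comparisons means `-eq`/`-lt` style tests only ever see confirmed integers; bounds checks would follow the same pattern.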

In `@lib/firewall/backend_iptables.sh`:
- Around line 177-185: The policy-route setup in
backend_iptables_ensure_policy_route currently silences failures with "|| true";
instead remove those suppressions so failures are detected, and on any failing
ip rule or ip route command set FW_LAST_ERROR to a descriptive message and
return "${E_FIREWALL_APPLY}". Specifically, in
backend_iptables_ensure_policy_route (the ip_tool="$(ip_cmd)" function) check
the exit status of the ip rule add using
BOX_FWMARK/BOX_ROUTE_TABLE/BOX_ROUTE_PREF and the ip route add for table
BOX_ROUTE_TABLE; on failure assign FW_LAST_ERROR (include the failing
command/context) and return "${E_FIREWALL_APPLY}", and ensure callers of
backend_iptables_ensure_policy_route handle its non-zero return value
accordingly so policy-route configuration errors propagate.
- Around line 311-313: Replace the bare call to backend_iptables_init with an
explicit guard to match the function's other failure checks: call
backend_iptables_init inside an if ! ...; then set FW_LAST_ERROR to a
descriptive message (consistent with other checks) and return a non-zero status
(or perform the same failure handling used after cleanup/create_base); otherwise
proceed to clear FW_LAST_ERROR and continue. Ensure you reference
backend_iptables_init in your change and mirror the established error handling
pattern used elsewhere in this function.
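
For the first item, error propagation without `|| true` might be sketched as follows. The function name, the exact `ip` arguments, and the fallback value of `E_FIREWALL_APPLY` are assumptions for illustration; the `ip` binary is passed in as an argument here only so the sketch can be exercised with a test double.

```bash
#!/usr/bin/env bash
# Sketch only: the ip arguments and the E_FIREWALL_APPLY fallback are assumptions.
E_FIREWALL_APPLY="${E_FIREWALL_APPLY:-1}"
FW_LAST_ERROR=""

# $1 is the ip binary or a stand-in; the real function resolves it via ip_cmd.
ensure_policy_route_sketch() {
  local ip_tool="${1:-ip}"

  if ! "${ip_tool}" rule add fwmark "${BOX_FWMARK}" \
      table "${BOX_ROUTE_TABLE}" pref "${BOX_ROUTE_PREF}"; then
    FW_LAST_ERROR="ip rule add failed (fwmark=${BOX_FWMARK} table=${BOX_ROUTE_TABLE} pref=${BOX_ROUTE_PREF})"
    return "${E_FIREWALL_APPLY}"
  fi

  if ! "${ip_tool}" route add local default dev lo table "${BOX_ROUTE_TABLE}"; then
    FW_LAST_ERROR="ip route add failed (table=${BOX_ROUTE_TABLE})"
    return "${E_FIREWALL_APPLY}"
  fi

  FW_LAST_ERROR=""
  return 0
}
```

One caveat: a plain `add` can also fail when the rule or route already exists, so a real implementation may need to check for an existing entry first to stay idempotent.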

In `@lib/firewall/firewall.sh`:
- Around line 132-161: The JSON output from firewall_status_json is missing the
conditional "error" field present in the text output; update
firewall_status_json to append an "error" JSON pair when FW_LAST_ERROR is
non-empty. Inside firewall_status_json (after backend_iptables_collect_status
and before the final printf closing brace), compute a local error_part (empty or
",$(json_pair \"error\" \"${FW_LAST_ERROR}\")") and include that variable in the
printf argument list so the JSON includes the error field only when
FW_LAST_ERROR is set; use the existing json_pair helper to ensure proper
quoting.
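
The conditional field can be illustrated with a reduced sketch. The `json_pair` below is a stand-in with no escaping (the repository's helper presumably does more), and the `backend` field is a placeholder for whatever `firewall_status_json` actually prints.

```bash
#!/usr/bin/env bash
# Stand-in for the repository's json_pair; this version does no escaping.
json_pair() {
  printf '"%s":"%s"' "$1" "$2"
}

# Sketch: append ,"error":"..." only when FW_LAST_ERROR is non-empty.
status_json_sketch() {
  local error_part=""
  if [[ -n "${FW_LAST_ERROR:-}" ]]; then
    error_part=",$(json_pair "error" "${FW_LAST_ERROR}")"
  fi
  # "backend" is a placeholder field, not the real status payload.
  printf '{%s%s}\n' "$(json_pair "backend" "iptables")" "${error_part}"
}
```

Keeping the leading comma inside `error_part` means the format string stays valid JSON in both the empty and non-empty case.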

In `@lib/supervisor/adapter_mihomo.sh`:
- Around line 38-40: adapter_mihomo_reload currently always reports success
(returns 0) even though it’s a no-op; change it to explicitly indicate “not
implemented” by emitting a clear warning to stderr (or via the supervisor logger
used in this script) and returning a non-zero status (e.g., 1) so callers see
failure; keep the TODO comment referencing phase-2 and optionally add a short
note to open a follow-up issue for implementing the external-controller reload
endpoint.
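
A sketch of the "not implemented" behavior suggested above. The warning text and the use of plain `printf` are assumptions; the real script may route this through its own logger.

```bash
#!/usr/bin/env bash
# Sketch: report the missing reload path instead of pretending it succeeded.
adapter_mihomo_reload() {
  # TODO(phase-2): call the external-controller reload endpoint.
  printf 'warning: adapter_mihomo_reload is not implemented\n' >&2
  return 1
}
```

Callers that currently treat reload as best-effort would need to handle the non-zero status, for example by falling back to a restart.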

In `@lib/supervisor/mutator_sing_box.sh`:
- Around line 11-31: The early return after successful jq rendering skips
creating the companion overlay env file; remove or move the `return 0` so that
after the `jq ... >"${rendered_file}"` step the script always writes
`"$rendered_file.overlay.env"` (populate it with the BOX_* vars:
BOX_NETWORK_MODE, BOX_DNS_HIJACK_MODE, BOX_TPROXY_PORT, BOX_REDIR_PORT,
BOX_DNS_PORT or equivalent) and only then exit; apply the same change to the
second jq block referenced (the block around the other jq call at lines 49-55)
so both paths produce the `.overlay.env` file before returning.
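
One way to guarantee the companion file on every path is to factor the write into a helper that both jq branches call before returning. The helper name `write_overlay_env` and the `KEY=value` layout are assumptions; only the variable names come from the comment above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: writes "<rendered_file>.overlay.env" next to the
# rendered config, one KEY=value line per BOX_* setting.
write_overlay_env() {
  local rendered_file="$1"
  {
    printf 'BOX_NETWORK_MODE=%s\n' "${BOX_NETWORK_MODE:-}"
    printf 'BOX_DNS_HIJACK_MODE=%s\n' "${BOX_DNS_HIJACK_MODE:-}"
    printf 'BOX_TPROXY_PORT=%s\n' "${BOX_TPROXY_PORT:-}"
    printf 'BOX_REDIR_PORT=%s\n' "${BOX_REDIR_PORT:-}"
    printf 'BOX_DNS_PORT=%s\n' "${BOX_DNS_PORT:-}"
  } >"${rendered_file}.overlay.env"
}
```

Each jq branch would then end with `write_overlay_env "${rendered_file}"` followed by `return 0`, so no branch can skip the env file.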

In `@lib/supervisor/supervisor.sh`:
- Around line 228-231: The restart currently calls service_stop and
service_start separately, leaving a race window; change restart to perform an
atomic locked restart by adding a helper service_restart_locked that calls
service_stop_locked then service_start_locked, and update service_restart to
call with_lock "service" 30 service_restart_locked (or alternatively wrap the
existing service_stop and service_start calls together inside with_lock
"service" 30) so the stop/start execute under a single lock.
- Around line 98-100: check_core_config currently returns the E_CORE_START error
code for unsupported cores without logging; mirror the behavior of
render_runtime_config and resolve_core_bin by logging a clear error before
returning. Update the unsupported-core branch in check_core_config to call the
existing logger (e.g., use process- or script-level logger pattern used
elsewhere) with a message like "Unsupported core: <core_name>" and the error
code, then return "${E_CORE_START}"; reference the check_core_config function
and the E_CORE_START symbol so the change is consistent with
render_runtime_config and resolve_core_bin.
- Around line 119-138: The runtime snapshot currently writes pid as a JSON
string in write_runtime_snapshot; make it a numeric value to match
service_print_status_json's json_num_pair usage by emitting "pid": without
surrounding quotes and ensuring the pid parameter is coerced to a numeric
default (e.g., 0) before rendering; keep rendered_config and config_source as
escaped strings using json_escape, and update the write_runtime_snapshot
function signature handling of pid accordingly so the JSON schema is consistent
with service_print_status_json and consumers expect a number.
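
The pid coercion in the last item might be sketched like this. `render_pid_field` is a hypothetical name used for illustration; the real change would live inside `write_runtime_snapshot`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: emits the pid as a bare JSON number, defaulting to 0
# when the input is empty or not a decimal integer.
render_pid_field() {
  local pid="${1:-}"
  [[ "${pid}" =~ ^[0-9]+$ ]] || pid=0
  # 10# forces base 10 so a value like 007 is not read as octal.
  printf '"pid":%d' "$((10#${pid}))"
}
```

Normalizing through arithmetic expansion also strips leading zeros, which JSON does not allow in numbers.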

In `@README.md`:
- Around line 14-53: The Markdown headings "Quick run (dev)", "Systemd units",
"Phase 2 notes", and "Remaining TODO" each lack the required blank line after
the heading; update README.md by inserting a single blank line immediately after
those heading lines so that the content following each heading is separated
(ensure the lines containing those exact headings are followed by an empty
line).

In `@tests/fixtures/mockbin/ip`:
- Around line 42-44: The table extraction is fragile because it assumes the
table ID is the last token (table="${*: -1}"); replace this with logic that
scans the arguments for the literal token "table" and uses the following token
as the table ID. For example, split "$@" into an array (tokens=("$@")), loop
over tokens and when tokens[i] == "table" set table="${tokens[i+1]}" (with a
guard that i+1 exists) so commands like 'ip route add ... table 100 dev eth0'
correctly pick up 100; update the conditional that currently checks [[ "${*}" ==
*" table "* ]] to use this scan and only set table when a following token is
present.
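
The argument scan described above, written out as a standalone function. `extract_table` is a hypothetical name; in the mock the loop would run directly over the script's own `"$@"`.

```bash
#!/usr/bin/env bash
# Sketch: return the token that follows the literal "table" argument,
# or an empty string when there is no such token.
extract_table() {
  local -a tokens=("$@")
  local i table=""
  for ((i = 0; i < ${#tokens[@]}; i++)); do
    # Guard i+1 so a trailing "table" with no value is ignored.
    if [[ "${tokens[i]}" == "table" ]] && (( i + 1 < ${#tokens[@]} )); then
      table="${tokens[i+1]}"
      break
    fi
  done
  printf '%s' "${table}"
}
```

With this, `extract_table route add default table 100 dev eth0` yields `100` even though the table ID is not the last argument.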

In `@tests/fixtures/mockbin/iptables`:
- Around line 62-91: Handlers for iptables options (-A, -F, -D, -X, and -C) must
validate that the target chain exists and return the real iptables failure when
it does not; before performing mutations in the -A, -F, -D and -X blocks call a
chain existence check (e.g., chain_exists "${chain}") and if it fails print to
stderr "No chain/target/match by that name." and exit 1, and for -C ensure it
returns non-zero and emits the same message when the chain is missing (while
keeping rule_exists for rule presence checks); use STATE_FILE/table/chain
variables as in the diff and ensure failures use exit status 1 rather than
silently succeeding.
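
A sketch of the chain check for the mock. The `CHAIN|table|chain` record layout in `STATE_FILE` is an assumption modeled on the `RULE|table|chain` entries referenced elsewhere in this review, and `require_chain` is a hypothetical wrapper; the mock itself would `exit 1` where this function returns 1.

```bash
#!/usr/bin/env bash
# Assumed state layout: one "CHAIN|<table>|<chain>" line per existing chain.
chain_exists() {
  local table="$1" chain="$2"
  grep -qxF -- "CHAIN|${table}|${chain}" "${STATE_FILE}" 2>/dev/null
}

# Hypothetical guard to call at the top of the -A, -F, -D, -X and -C handlers.
require_chain() {
  local table="$1" chain="$2"
  if ! chain_exists "${table}" "${chain}"; then
    echo "iptables: No chain/target/match by that name." >&2
    return 1
  fi
}
```

Matching whole lines with `-x` and fixed strings with `-F` keeps a chain named `BOX` from matching `BOX_DNS_NAT` and keeps the `|` separators literal.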

In `@tests/integration/test_phase2.sh`:
- Around line 168-179: assert_magicdns_bypass_rules currently only checks for
UDP bypass rules; update it (and the duplicate checks around the other
occurrence) to also assert TCP rules exist. Specifically, in the function
assert_magicdns_bypass_rules add two grep assertions mirroring the UDP ones but
matching '-p tcp --dport 53 -j RETURN' for both the mangle
(RULE|mangle|BOX_DNS_MANGLE) and nat (RULE|nat|BOX_DNS_NAT) entries so TCP
resolver bypass rules are validated as well.
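
The extended assertion could be sketched as a loop over both protocols. The state file argument and the assumption that each rule is recorded on one line containing both the `RULE|table|chain` prefix and the rule text are illustrative; the real test may read the mock's state differently.

```bash
#!/usr/bin/env bash
# Sketch: require a port-53 RETURN rule for both UDP and TCP in each DNS chain.
assert_magicdns_bypass_rules() {
  local state_file="$1" entry proto
  for entry in 'RULE|mangle|BOX_DNS_MANGLE' 'RULE|nat|BOX_DNS_NAT'; do
    for proto in udp tcp; do
      # -F keeps "|" literal; "--" stops grep from parsing "-p" as an option.
      if ! grep -F -- "${entry}" "${state_file}" \
          | grep -qF -- "-p ${proto} --dport 53 -j RETURN"; then
        echo "missing ${proto} bypass rule for ${entry}" >&2
        return 1
      fi
    done
  done
  return 0
}
```

Looping over the protocols keeps the UDP and TCP checks symmetric, so a future change to one pattern cannot silently skip the other.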

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e8858011-4fcb-40a7-99d9-3e0fa8667118

📥 Commits

Reviewing files that changed from the base of the PR and between 4981c94 and 250ecde.

📒 Files selected for processing (26)

.gitignore
README.md
cmd/boxctl
docs/linux-port/04-component-firewall-routing.md
docs/linux-port/06-component-network-policy-watchers.md
docs/linux-port/08-delivery-plan-and-risk-register.md
docs/linux-port/09-phase2-firewall-supervisor-tests.md
docs/linux-port/10-steering-prompt-tailscale-dns.md
docs/linux-port/README.md
etc/box/box.toml
lib/common.sh
lib/config.sh
lib/firewall/backend_iptables.sh
lib/firewall/firewall.sh
lib/supervisor/adapter_mihomo.sh
lib/supervisor/adapter_sing_box.sh
lib/supervisor/mutator_mihomo.sh
lib/supervisor/mutator_sing_box.sh
lib/supervisor/supervisor.sh
systemd/box-firewall.service
systemd/box.service
tests/fixtures/mockbin/ip
tests/fixtures/mockbin/iptables
tests/fixtures/mockbin/mihomo
tests/fixtures/mockbin/sing-box
tests/integration/test_phase2.sh

📜 Review details

🧰 Additional context used

🧬 Code graph analysis (3)

lib/config.sh (1)

lib/common.sh (1)

log (62-77)

lib/supervisor/supervisor.sh (7)

lib/common.sh (8)

init_runtime_paths (49-56)

log (62-77)

timestamp_utc (58-60)

json_escape (97-105)

require_root (87-95)

read_pid_file (177-182)

is_pid_alive (172-175)

with_lock (135-154)

lib/supervisor/mutator_mihomo.sh (1)

mutator_mihomo_render_overlay (19-46)

lib/supervisor/mutator_sing_box.sh (1)

mutator_sing_box_render_overlay (7-56)

lib/supervisor/adapter_mihomo.sh (3)

adapter_mihomo_resolve_bin (7-19)

adapter_mihomo_check_config (21-26)

adapter_mihomo_start (28-36)

lib/supervisor/adapter_sing_box.sh (3)

adapter_sing_box_resolve_bin (7-19)

adapter_sing_box_check_config (21-26)

adapter_sing_box_start (28-36)

lib/config.sh (1)

load_config (217-267)

lib/firewall/firewall.sh (2)

firewall_enable (83-85)

firewall_disable (87-89)

lib/firewall/backend_iptables.sh (1)

lib/common.sh (1)

log (62-77)

🪛 LanguageTool

docs/linux-port/09-phase2-firewall-supervisor-tests.md

[style] ~26-~26: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...tailscale bypass and DNS exclusions. 5. Apply policy placeholder stage. 6. Apply mode...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

[style] ~27-~27: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: .... 5. Apply policy placeholder stage. 6. Apply mode-specific rules. 7. Apply DNS strat...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

[style] ~28-~28: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...stage. 6. Apply mode-specific rules. 7. Apply DNS strategy rules. 8. Ensure policy ro...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 markdownlint-cli2 (0.21.0)

README.md

[warning] 14-14: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below