feat(linux-port): bootstrap MVP boxctl, config loader, supervisor, firewall skeleton, and systemd units#1
feat(linux-port): bootstrap MVP boxctl, config loader, supervisor, firewall skeleton, and systemd units#1
Conversation
…rewall skeleton, and systemd units
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository UI Review profile: ASSERTIVE Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
…f convergence, and side-effect-free service status
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 18
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/linux-port/06-component-network-policy-watchers.md`:
- Around line 50-53: The documentation's dns_coexist_mode values are out of sync
with runtime validation: replace the advertised `proxy_all` option with the
actual `strict_box` option and update the descriptions so `preserve_tailnet`
(default) still never hijacks the Tailscale resolver and routes
`*.ts.net`/MagicDNS via the system resolver path, while `strict_box` is the
explicit opt-in that allows full DNS hijack; ensure the doc text for
dns_coexist_mode lists the exact accepted token names
(`preserve_tailnet|strict_box`) and describes behaviors to match the runtime
validation and implementation.
In `@docs/linux-port/09-phase2-firewall-supervisor-tests.md`:
- Around line 3-55: The file triggers markdownlint MD022 because several section
headings (e.g., "Scope Implemented", "Firewall Apply Order", "Coexistence
Modes", "Route Pref Convergence", "Overlay Contract", "Known Gaps") are not
followed by a blank line; fix by inserting a single blank line immediately after
each of those heading lines so each heading is followed by an empty line (ensure
the headings exactly as written in the diff are updated).
In `@lib/common.sh`:
- Around line 69-71: The raw message variable is interpolated into log_line and
can contain quotes/newlines that break the structured log; add an escaping step
(e.g., a helper like escape_message) that transforms message by escaping
backslashes and double quotes and converting newlines/tabs to literal \n/\t
sequences, then use the escaped value when building log_line (so update the
log_line construction to use the escaped message instead of the raw message);
reference the timestamp_utc call remains unchanged and ensure the helper is
invoked before log_line is set.
In `@lib/config.sh`:
- Around line 63-71: strip_inline_comment currently checks the first character
before trimming leading whitespace, so values like key = "a#b" get treated as
unquoted and truncated; change the logic in strip_inline_comment to detect the
first non-whitespace character (i.e., compute the first non-space char of local
value and use that for quote detection) and if that char is a double or single
quote, return the original value (preserving leading whitespace), otherwise
proceed to strip inline comments with value="${value%%#*}" as before; refer to
the strip_inline_comment function and the local value/first_char variables when
making this change.
- Around line 127-141: In config_detect_file(), don't allow an explicitly set
BOX_CONFIG_FILE to be ignored by falling back to dev_cfg: first check if
BOX_CONFIG_FILE is non-empty and if the referenced file exists—if it does, set
BOX_CONFIG_SOURCE (e.g., "env" or "explicit") and return 0; if BOX_CONFIG_FILE
is non-empty but the file does not exist, fail fast (log an error and return
non-zero) instead of continuing to check dev_cfg; otherwise continue with the
existing logic that checks system_cfg and dev_cfg. Ensure you reference the
BOX_CONFIG_FILE variable and the config_detect_file function when making the
change so the intent is clear.
- Around line 199-215: Validate that BOX_ROUTE_TABLE, BOX_TAILSCALE_ROUTE_TABLE,
and BOX_ROUTE_PREF are numeric (decimal integer) before doing comparisons: add
checks (e.g., a regex like ^[0-9]+$) for BOX_ROUTE_TABLE and
BOX_TAILSCALE_ROUTE_TABLE and for BOX_ROUTE_PREF, and log and return E_CONFIG
with a clear error code/message (reuse or add E_CONFIG_ROUTE_TABLE and
E_CONFIG_ROUTE_PREF) if any value is non-numeric or out of allowed bounds;
perform these validations before the existing comparisons in the same validation
block so the comparisons only run on confirmed numeric values.
In `@lib/firewall/backend_iptables.sh`:
- Around line 177-185: The policy-route setup in
backend_iptables_ensure_policy_route currently silences failures with "|| true";
instead remove those suppressions so failures are detected, and on any failing
ip rule or ip route command set FW_LAST_ERROR to a descriptive message and
return "${E_FIREWALL_APPLY}". Specifically, in
backend_iptables_ensure_policy_route (the ip_tool="$(ip_cmd)" function) check
the exit status of the ip rule add using
BOX_FWMARK/BOX_ROUTE_TABLE/BOX_ROUTE_PREF and the ip route add for table
BOX_ROUTE_TABLE; on failure assign FW_LAST_ERROR (include the failing
command/context) and return "${E_FIREWALL_APPLY}", and ensure callers of
backend_iptables_ensure_policy_route handle its non-zero return value
accordingly so policy-route configuration errors propagate.
- Around line 311-313: Replace the bare call to backend_iptables_init with an
explicit guard to match the function's other failure checks: call
backend_iptables_init inside an if ! ...; then set FW_LAST_ERROR to a
descriptive message (consistent with other checks) and return a non-zero status
(or perform the same failure handling used after cleanup/create_base); otherwise
proceed to clear FW_LAST_ERROR and continue. Ensure you reference
backend_iptables_init in your change and mirror the established error handling
pattern used elsewhere in this function.
In `@lib/firewall/firewall.sh`:
- Around line 132-161: The JSON output from firewall_status_json is missing the
conditional "error" field present in the text output; update
firewall_status_json to append an "error" JSON pair when FW_LAST_ERROR is
non-empty. Inside firewall_status_json (after backend_iptables_collect_status
and before the final printf closing brace), compute a local error_part (empty or
",$(json_pair \"error\" \"${FW_LAST_ERROR}\")") and include that variable in the
printf argument list so the JSON includes the error field only when
FW_LAST_ERROR is set; use the existing json_pair helper to ensure proper
quoting.
In `@lib/supervisor/adapter_mihomo.sh`:
- Around line 38-40: adapter_mihomo_reload currently always reports success
(returns 0) even though it’s a no-op; change it to explicitly indicate “not
implemented” by emitting a clear warning to stderr (or via the supervisor logger
used in this script) and returning a non-zero status (e.g., 1) so callers see
failure; keep the TODO comment referencing phase-2 and optionally add a short
note to open a follow-up issue for implementing the external-controller reload
endpoint.
In `@lib/supervisor/mutator_sing_box.sh`:
- Around line 11-31: The early return after successful jq rendering skips
creating the companion overlay env file; remove or move the `return 0` so that
after the `jq ... >"${rendered_file}"` step the script always writes
`"$rendered_file.overlay.env"` (populate it with the BOX_* vars:
BOX_NETWORK_MODE, BOX_DNS_HIJACK_MODE, BOX_TPROXY_PORT, BOX_REDIR_PORT,
BOX_DNS_PORT or equivalent) and only then exit; apply the same change to the
second jq block referenced (the block around the other jq call at lines 49-55)
so both paths produce the `.overlay.env` file before returning.
In `@lib/supervisor/supervisor.sh`:
- Around line 228-231: The restart currently calls service_stop and
service_start separately, leaving a race window; change restart to perform an
atomic locked restart by adding a helper service_restart_locked that calls
service_stop_locked then service_start_locked, and update service_restart to
call with_lock "service" 30 service_restart_locked (or alternatively wrap the
existing service_stop and service_start calls together inside with_lock
"service" 30) so the stop/start execute under a single lock.
- Around line 98-100: check_core_config currently returns the E_CORE_START error
code for unsupported cores without logging; mirror the behavior of
render_runtime_config and resolve_core_bin by logging a clear error before
returning. Update the unsupported-core branch in check_core_config to call the
existing logger (e.g., use process- or script-level logger pattern used
elsewhere) with a message like "Unsupported core: <core_name>" and the error
code, then return "${E_CORE_START}"; reference the check_core_config function
and the E_CORE_START symbol so the change is consistent with
render_runtime_config and resolve_core_bin.
- Around line 119-138: The runtime snapshot currently writes pid as a JSON
string in write_runtime_snapshot; make it a numeric value to match
service_print_status_json's json_num_pair usage by emitting "pid": without
surrounding quotes and ensuring the pid parameter is coerced to a numeric
default (e.g., 0) before rendering; keep rendered_config and config_source as
escaped strings using json_escape, and update the write_runtime_snapshot
function signature handling of pid accordingly so the JSON schema is consistent
with service_print_status_json and consumers expect a number.
In `@README.md`:
- Around line 14-53: The Markdown headings "Quick run (dev)", "Systemd units",
"Phase 2 notes", and "Remaining TODO" each lack the required blank line after
the heading; update README.md by inserting a single blank line immediately after
those heading lines so that the content following each heading is separated
(ensure the lines containing those exact headings are followed by an empty
line).
In `@tests/fixtures/mockbin/ip`:
- Around line 42-44: The table extraction is fragile because it assumes the
table ID is the last token (table="${*: -1}"); replace this with logic that
scans the arguments for the literal token "table" and uses the following token
as the table ID. For example, split "$@" into an array (tokens=("$@")), loop
over tokens and when tokens[i] == "table" set table="${tokens[i+1]}" (with a
guard that i+1 exists) so commands like 'ip route add ... table 100 dev eth0'
correctly pick up 100; update the conditional that currently checks [[ "${*}" ==
*" table "* ]] to use this scan and only set table when a following token is
present.
In `@tests/fixtures/mockbin/iptables`:
- Around line 62-91: Handlers for iptables options (-A, -F, -D, -X, and -C) must
validate that the target chain exists and return the real iptables failure when
it does not; before performing mutations in the -A, -F, -D and -X blocks call a
chain existence check (e.g., chain_exists "${chain}") and if it fails print to
stderr "No chain/target/match by that name." and exit 1, and for -C ensure it
returns non-zero and emits the same message when the chain is missing (while
keeping rule_exists for rule presence checks); use STATE_FILE/table/chain
variables as in the diff and ensure failures use exit status 1 rather than
silently succeeding.
In `@tests/integration/test_phase2.sh`:
- Around line 168-179: assert_magicdns_bypass_rules currently only checks for
UDP bypass rules; update it (and the duplicate checks around the other
occurrence) to also assert TCP rules exist. Specifically, in the function
assert_magicdns_bypass_rules add two grep assertions mirroring the UDP ones but
matching '-p tcp --dport 53 -j RETURN' for both the mangle
(RULE|mangle|BOX_DNS_MANGLE) and nat (RULE|nat|BOX_DNS_NAT) entries so TCP
resolver bypass rules are validated as well.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: e8858011-4fcb-40a7-99d9-3e0fa8667118
📒 Files selected for processing (26)
.gitignoreREADME.mdcmd/boxctldocs/linux-port/04-component-firewall-routing.mddocs/linux-port/06-component-network-policy-watchers.mddocs/linux-port/08-delivery-plan-and-risk-register.mddocs/linux-port/09-phase2-firewall-supervisor-tests.mddocs/linux-port/10-steering-prompt-tailscale-dns.mddocs/linux-port/README.mdetc/box/box.tomllib/common.shlib/config.shlib/firewall/backend_iptables.shlib/firewall/firewall.shlib/supervisor/adapter_mihomo.shlib/supervisor/adapter_sing_box.shlib/supervisor/mutator_mihomo.shlib/supervisor/mutator_sing_box.shlib/supervisor/supervisor.shsystemd/box-firewall.servicesystemd/box.servicetests/fixtures/mockbin/iptests/fixtures/mockbin/iptablestests/fixtures/mockbin/mihomotests/fixtures/mockbin/sing-boxtests/integration/test_phase2.sh
📜 Review details
🧰 Additional context used
🧬 Code graph analysis (3)
lib/config.sh (1)
lib/common.sh (1)
log(62-77)
lib/supervisor/supervisor.sh (7)
lib/common.sh (8)
init_runtime_paths(49-56)log(62-77)timestamp_utc(58-60)json_escape(97-105)require_root(87-95)read_pid_file(177-182)is_pid_alive(172-175)with_lock(135-154)lib/supervisor/mutator_mihomo.sh (1)
mutator_mihomo_render_overlay(19-46)lib/supervisor/mutator_sing_box.sh (1)
mutator_sing_box_render_overlay(7-56)lib/supervisor/adapter_mihomo.sh (3)
adapter_mihomo_resolve_bin(7-19)adapter_mihomo_check_config(21-26)adapter_mihomo_start(28-36)lib/supervisor/adapter_sing_box.sh (3)
adapter_sing_box_resolve_bin(7-19)adapter_sing_box_check_config(21-26)adapter_sing_box_start(28-36)lib/config.sh (1)
load_config(217-267)lib/firewall/firewall.sh (2)
firewall_enable(83-85)firewall_disable(87-89)
lib/firewall/backend_iptables.sh (1)
lib/common.sh (1)
log(62-77)
🪛 LanguageTool
docs/linux-port/09-phase2-firewall-supervisor-tests.md
[style] ~26-~26: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...tailscale bypass and DNS exclusions. 5. Apply policy placeholder stage. 6. Apply mode...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~27-~27: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: .... 5. Apply policy placeholder stage. 6. Apply mode-specific rules. 7. Apply DNS strat...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~28-~28: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...stage. 6. Apply mode-specific rules. 7. Apply DNS strategy rules. 8. Ensure policy ro...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🪛 markdownlint-cli2 (0.21.0)
README.md
[warning] 14-14: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 27-27: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 33-33: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 50-50: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
docs/linux-port/09-phase2-firewall-supervisor-tests.md
[warning] 14-14: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 27-27: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 33-33: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 50-50: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
🪛 Shellcheck (0.11.0)
lib/common.sh
[warning] 10-10: BOX_ETC_DIR_DEFAULT appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 27-27: E_CONFIG appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 28-28: E_CORE_START appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 29-29: E_FIREWALL_APPLY appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 30-30: E_POLICY appears unused. Verify use (or export if used externally).
(SC2034)
lib/supervisor/supervisor.sh
[info] 7-7: Not following: ./supervisor/adapter_mihomo.sh was not specified as input (see shellcheck -x).
(SC1091)
[info] 8-8: Not following: ./supervisor/adapter_sing_box.sh was not specified as input (see shellcheck -x).
(SC1091)
[info] 9-9: Not following: ./supervisor/mutator_mihomo.sh was not specified as input (see shellcheck -x).
(SC1091)
[info] 10-10: Not following: ./supervisor/mutator_sing_box.sh was not specified as input (see shellcheck -x).
(SC1091)
[info] 11-11: Not following: ./firewall/firewall.sh was not specified as input (see shellcheck -x).
(SC1091)
lib/firewall/firewall.sh
[info] 7-7: Not following: ./firewall/backend_iptables.sh was not specified as input (see shellcheck -x).
(SC1091)
lib/firewall/backend_iptables.sh
[style] 191-191: Use { ..; } instead of (..) to avoid subshell overhead.
(SC2235)
🔇 Additional comments (28)
tests/fixtures/mockbin/mihomo (1)
1-12: Mock lifecycle behavior is solid.
-tvalidation mode and TERM/INT handling are both implemented cleanly for integration tests.docs/linux-port/04-component-firewall-routing.md (1)
75-93: Tailscale coexistence requirements are clearly scoped.The added hard requirements and DNS guidance are concrete and testable.
.gitignore (1)
1-14: Ignore rules look appropriate for local/dev artifacts.docs/linux-port/08-delivery-plan-and-risk-register.md (1)
41-60: Risk and acceptance updates are well-targeted.The new Tailscale/MagicDNS risk and lifecycle preservation criteria materially improve release readiness.
Also applies to: 98-99
lib/supervisor/mutator_mihomo.sh (1)
7-46: Mihomo overlay rendering flow is clear and consistent.The source fallback + scalar mutation sequence is straightforward and suitable for supervisor rendering.
etc/box/box.toml (1)
1-25: Default TOML profile is consistent with the loader contract.Keys and default values align cleanly with the config loader/validator surface.
docs/linux-port/README.md (1)
17-18: Document index additions are clear and properly placed.Good update to keep Phase 2 docs discoverable from the main Linux-port index.
systemd/box-firewall.service (1)
1-16: Firewall unit lifecycle is cleanly defined for oneshot operation.Start/stop/reload actions and timeout bounds look consistent with the MVP control flow.
systemd/box.service (1)
1-16: Core supervisor unit is well-structured for command-driven lifecycle control.
ExecStart/ExecStop/ExecReloadmappings and timeout settings are coherent.tests/fixtures/mockbin/sing-box (1)
5-14: Mock command behavior is stable for supervisor integration tests.The
checkfast-success path andrunsignal-trapped loop are clean and predictable for test orchestration.cmd/boxctl (1)
23-49: Dispatch flow and error-path behavior look correct.
service/firewallrouting, JSON output toggle, and unknown-component handling are implemented clearly.docs/linux-port/10-steering-prompt-tailscale-dns.md (1)
5-41: Constraints and verification checklist are well-scoped.The prompt clearly encodes non-destructive coexistence requirements and concrete verification expectations.
tests/integration/test_phase2.sh (1)
274-348: Phase coverage is strong and aligns with the MVP objectives.The six-phase integration flow provides solid end-to-end coverage for coexistence, route convergence, and both core adapters.
lib/supervisor/adapter_sing_box.sh (1)
13-16: This is not an issue.BOX_CORE_BIN_DIRis initialized with a default value at the module level inlib/config.sh(line 29:BOX_CORE_BIN_DIR="/usr/local/bin"), andconfig.shis always sourced beforesupervisor.shin the entry point. Therefore, the variable is guaranteed to be set beforeadapter_sing_box_resolve_binexecutes, and no unbound-variable error will occur.> Likely an incorrect or invalid review comment.lib/firewall/firewall.sh (7)
1-7: LGTM!Good use of
set -euo pipefailfor strict error handling. The ShellCheck SC1091 info about the sourced file is expected and safe to ignore since the file exists and is provided viaBOX_LIB_DIR.
9-44: LGTM!State file management is clean. The
awkparsing infirewall_read_state_valuecorrectly handles values containing=characters by usingsubstrto get everything after the first=.
46-66: LGTM!Good structure with proper root check, config loading, backend dispatch, state persistence, and structured logging. Error propagation with
FW_LAST_ERRORprovides useful debugging context.
68-81: LGTM!Using
WARNfor unknown backend during disable is appropriate—being permissive here prevents blocking cleanup operations. State is correctly marked as disabled regardless of backend cleanup outcome.
91-99: LGTM!The sequential disable-then-enable approach is correct for renewal. With
set -e, if disable fails, enable won't be attempted, which is the safe behavior for a renew operation.
172-184: LGTM!Clean command interface with proper stderr usage output and conventional return code 2 for invalid actions.
127-129: The variableFW_LAST_ERRORis properly initialized and exported, so no fix is needed.The variable is initialized as an empty string in
backend_iptables.sh(line 23) and exported (line 385). Sincebackend_iptables.shis sourced at the top offirewall.sh(line 7),FW_LAST_ERRORis always defined beforefirewall_status_text()is called, even withset -uenabled. The current code on lines 127-129 is correct as-is.> Likely an incorrect or invalid review comment.lib/supervisor/supervisor.sh (7)
1-11: LGTM!Proper setup with strict mode and organized sourcing of adapters, mutators, and firewall module. ShellCheck SC1091 warnings are expected for external sourced files.
13-65: LGTM!Good separation between read-write path helpers (that create directories) and read-only variants (for status checks). The core-specific config extensions (yaml vs json) are handled consistently.
104-117: LGTM!Consistent dispatch pattern with the other core-type functions. The silent return for unsupported cores (line 113-115) follows the same pattern as
check_core_config—both could benefit from logging, but sinceresolve_core_binlogs and runs first, an unsupported core would already be caught.
140-193: LGTM!Well-structured startup sequence with proper idempotency (early return if running), staged validation (binary → config → process), and rollback on firewall failure. The 1-second sleep for process stabilization check is a reasonable heuristic for this context.
195-218: LGTM!Proper graceful shutdown with SIGTERM→wait→SIGKILL escalation. Using
|| trueforfirewall_disableensures stop completes even if firewall cleanup fails, which is the correct behavior for a stop operation.
260-283: LGTM!Excellent design for a status command: temporarily disables log file writes to avoid side effects, uses read-only path variants, and tolerates config errors with
|| true. This aligns with the PR commit message about "side-effect-free service status."
285-297: LGTM!Clean command interface consistent with the
firewall_cmdpattern. Proper stderr usage output and return code 2 for invalid actions.
| - Add a `dns_coexist_mode` policy: | ||
| - `preserve_tailnet` (default): never hijack Tailscale DNS resolver/domain set. | ||
| - `proxy_all`: allow full DNS hijack (explicit opt-in). | ||
| - If `preserve_tailnet`, route `*.ts.net`/MagicDNS via system resolver path and bypass proxy DNS interception for Tailscale resolver. |
There was a problem hiding this comment.
dns_coexist_mode value here is out of sync with runtime validation.
This doc advertises proxy_all, but lib/config.sh (Lines 191-193) accepts preserve_tailnet|strict_box. Following this doc can produce invalid config.
Proposed doc fix
- Add a `dns_coexist_mode` policy:
- `preserve_tailnet` (default): never hijack Tailscale DNS resolver/domain set.
-- `proxy_all`: allow full DNS hijack (explicit opt-in).
+- `strict_box`: allow full DNS hijack (explicit opt-in).📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - Add a `dns_coexist_mode` policy: | |
| - `preserve_tailnet` (default): never hijack Tailscale DNS resolver/domain set. | |
| - `proxy_all`: allow full DNS hijack (explicit opt-in). | |
| - If `preserve_tailnet`, route `*.ts.net`/MagicDNS via system resolver path and bypass proxy DNS interception for Tailscale resolver. | |
| - Add a `dns_coexist_mode` policy: | |
| - `preserve_tailnet` (default): never hijack Tailscale DNS resolver/domain set. | |
| - `strict_box`: allow full DNS hijack (explicit opt-in). | |
| - If `preserve_tailnet`, route `*.ts.net`/MagicDNS via system resolver path and bypass proxy DNS interception for Tailscale resolver. |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/linux-port/06-component-network-policy-watchers.md` around lines 50 -
53, The documentation's dns_coexist_mode values are out of sync with runtime
validation: replace the advertised `proxy_all` option with the actual
`strict_box` option and update the descriptions so `preserve_tailnet` (default)
still never hijacks the Tailscale resolver and routes `*.ts.net`/MagicDNS via
the system resolver path, while `strict_box` is the explicit opt-in that allows
full DNS hijack; ensure the doc text for dns_coexist_mode lists the exact
accepted token names (`preserve_tailnet|strict_box`) and describes behaviors to
match the runtime validation and implementation.
| ## Scope Implemented | ||
| - Firewall staged apply/rollback in `lib/firewall/backend_iptables.sh` | ||
| - Modes: `tun`, `tproxy`, `redirect`, `mixed`, `enhance` | ||
| - DNS hijack strategies: `tproxy`, `redirect`, `disable` | ||
| - Tailscale coexistence safeguards: | ||
| - preserve existing tailscale policy/routing ownership | ||
| - never flush non-BOX rules/routes | ||
| - bypass `tailscale0`, `100.64.0.0/10`, and resolver `100.100.100.100:53` | ||
| - expose tailscale preservation flags in `firewall status --json` | ||
| - Core runtime overlays for `mihomo` and `sing-box`: | ||
| - `lib/supervisor/mutator_mihomo.sh` | ||
| - `lib/supervisor/mutator_sing_box.sh` | ||
| - JSON status output: | ||
| - `boxctl service status --json` | ||
| - `boxctl firewall status --json` | ||
| - Integration test harness: | ||
| - `tests/integration/test_phase2.sh` | ||
|
|
||
| ## Firewall Apply Order | ||
| 1. Cleanup existing BOX-owned objects. | ||
| 2. Create base BOX chains and attach table jumps. | ||
| 3. Apply anti-loop guards. | ||
| 4. Apply tailscale bypass and DNS exclusions. | ||
| 5. Apply policy placeholder stage. | ||
| 6. Apply mode-specific rules. | ||
| 7. Apply DNS strategy rules. | ||
| 8. Ensure policy routing for mark-based paths. | ||
| 9. On any failure: full cleanup rollback. | ||
|
|
||
| ## Coexistence Modes | ||
| - `preserve_tailnet` (default): | ||
| - apply tailscale bypass for `tailscale0` | ||
| - bypass tailnet CIDR `100.64.0.0/10` | ||
| - exclude `100.100.100.100:53` from DNS hijack | ||
| - `strict_box`: | ||
| - skip tailscale bypass and MagicDNS exclusion rule insertion | ||
| - still keep non-destructive cleanup boundaries (BOX-owned objects only) | ||
|
|
||
| ## Route Pref Convergence | ||
| - Renew/reapply path prunes stale BOX policy rules matching the configured Box fwmark+table regardless of previous `pref`. | ||
| - After apply there must be exactly one BOX fwmark rule at the current `route_pref`. | ||
| - Tailscale rules (e.g., `fwmark 0x80000/0xff0000` and table `52`) are never targeted. | ||
|
|
||
|
|
||
| ## Overlay Contract | ||
| - Source config files are never edited in place. | ||
| - Rendered runtime config path: | ||
| - `${BOX_RUN_DIR}/rendered/mihomo/config.yaml` | ||
| - `${BOX_RUN_DIR}/rendered/sing-box/config.json` | ||
| - Default run dir is `/run/box`, with repo-local fallback via runtime path resolver. | ||
|
|
||
| ## Known Gaps | ||
| - UID/GID/interface/MAC policy filters are placeholders. |
There was a problem hiding this comment.
Add blank lines below section headings (MD022).
Headings (e.g., around Line 3, Line 21, Line 32, Line 41, Line 47, Line 54) should be followed by a blank line for markdownlint compliance.
📝 Proposed markdown fix
## Scope Implemented
+
- Firewall staged apply/rollback in `lib/firewall/backend_iptables.sh`
@@
## Firewall Apply Order
+
1. Cleanup existing BOX-owned objects.
@@
## Coexistence Modes
+
- `preserve_tailnet` (default):
@@
## Route Pref Convergence
+
- Renew/reapply path prunes stale BOX policy rules matching the configured Box fwmark+table regardless of previous `pref`.
@@
## Overlay Contract
+
- Source config files are never edited in place.
@@
## Known Gaps
+
- UID/GID/interface/MAC policy filters are placeholders.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ## Scope Implemented | |
| - Firewall staged apply/rollback in `lib/firewall/backend_iptables.sh` | |
| - Modes: `tun`, `tproxy`, `redirect`, `mixed`, `enhance` | |
| - DNS hijack strategies: `tproxy`, `redirect`, `disable` | |
| - Tailscale coexistence safeguards: | |
| - preserve existing tailscale policy/routing ownership | |
| - never flush non-BOX rules/routes | |
| - bypass `tailscale0`, `100.64.0.0/10`, and resolver `100.100.100.100:53` | |
| - expose tailscale preservation flags in `firewall status --json` | |
| - Core runtime overlays for `mihomo` and `sing-box`: | |
| - `lib/supervisor/mutator_mihomo.sh` | |
| - `lib/supervisor/mutator_sing_box.sh` | |
| - JSON status output: | |
| - `boxctl service status --json` | |
| - `boxctl firewall status --json` | |
| - Integration test harness: | |
| - `tests/integration/test_phase2.sh` | |
| ## Firewall Apply Order | |
| 1. Cleanup existing BOX-owned objects. | |
| 2. Create base BOX chains and attach table jumps. | |
| 3. Apply anti-loop guards. | |
| 4. Apply tailscale bypass and DNS exclusions. | |
| 5. Apply policy placeholder stage. | |
| 6. Apply mode-specific rules. | |
| 7. Apply DNS strategy rules. | |
| 8. Ensure policy routing for mark-based paths. | |
| 9. On any failure: full cleanup rollback. | |
| ## Coexistence Modes | |
| - `preserve_tailnet` (default): | |
| - apply tailscale bypass for `tailscale0` | |
| - bypass tailnet CIDR `100.64.0.0/10` | |
| - exclude `100.100.100.100:53` from DNS hijack | |
| - `strict_box`: | |
| - skip tailscale bypass and MagicDNS exclusion rule insertion | |
| - still keep non-destructive cleanup boundaries (BOX-owned objects only) | |
| ## Route Pref Convergence | |
| - Renew/reapply path prunes stale BOX policy rules matching the configured Box fwmark+table regardless of previous `pref`. | |
| - After apply there must be exactly one BOX fwmark rule at the current `route_pref`. | |
| - Tailscale rules (e.g., `fwmark 0x80000/0xff0000` and table `52`) are never targeted. | |
| ## Overlay Contract | |
| - Source config files are never edited in place. | |
| - Rendered runtime config path: | |
| - `${BOX_RUN_DIR}/rendered/mihomo/config.yaml` | |
| - `${BOX_RUN_DIR}/rendered/sing-box/config.json` | |
| - Default run dir is `/run/box`, with repo-local fallback via runtime path resolver. | |
| ## Known Gaps | |
| - UID/GID/interface/MAC policy filters are placeholders. | |
| ## Scope Implemented | |
| - Firewall staged apply/rollback in `lib/firewall/backend_iptables.sh` | |
| - Modes: `tun`, `tproxy`, `redirect`, `mixed`, `enhance` | |
| - DNS hijack strategies: `tproxy`, `redirect`, `disable` | |
| - Tailscale coexistence safeguards: | |
| - preserve existing tailscale policy/routing ownership | |
| - never flush non-BOX rules/routes | |
| - bypass `tailscale0`, `100.64.0.0/10`, and resolver `100.100.100.100:53` | |
| - expose tailscale preservation flags in `firewall status --json` | |
| - Core runtime overlays for `mihomo` and `sing-box`: | |
| - `lib/supervisor/mutator_mihomo.sh` | |
| - `lib/supervisor/mutator_sing_box.sh` | |
| - JSON status output: | |
| - `boxctl service status --json` | |
| - `boxctl firewall status --json` | |
| - Integration test harness: | |
| - `tests/integration/test_phase2.sh` | |
| ## Firewall Apply Order | |
| 1. Cleanup existing BOX-owned objects. | |
| 2. Create base BOX chains and attach table jumps. | |
| 3. Apply anti-loop guards. | |
| 4. Apply tailscale bypass and DNS exclusions. | |
| 5. Apply policy placeholder stage. | |
| 6. Apply mode-specific rules. | |
| 7. Apply DNS strategy rules. | |
| 8. Ensure policy routing for mark-based paths. | |
| 9. On any failure: full cleanup rollback. | |
| ## Coexistence Modes | |
| - `preserve_tailnet` (default): | |
| - apply tailscale bypass for `tailscale0` | |
| - bypass tailnet CIDR `100.64.0.0/10` | |
| - exclude `100.100.100.100:53` from DNS hijack | |
| - `strict_box`: | |
| - skip tailscale bypass and MagicDNS exclusion rule insertion | |
| - still keep non-destructive cleanup boundaries (BOX-owned objects only) | |
| ## Route Pref Convergence | |
| - Renew/reapply path prunes stale BOX policy rules matching the configured Box fwmark+table regardless of previous `pref`. | |
| - After apply there must be exactly one BOX fwmark rule at the current `route_pref`. | |
| - Tailscale rules (e.g., `fwmark 0x80000/0xff0000` and table `52`) are never targeted. | |
| ## Overlay Contract | |
| - Source config files are never edited in place. | |
| - Rendered runtime config path: | |
| - `${BOX_RUN_DIR}/rendered/mihomo/config.yaml` | |
| - `${BOX_RUN_DIR}/rendered/sing-box/config.json` | |
| - Default run dir is `/run/box`, with repo-local fallback via runtime path resolver. | |
| ## Known Gaps | |
| - UID/GID/interface/MAC policy filters are placeholders. |
🧰 Tools
🪛 LanguageTool
[style] ~26-~26: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...tailscale bypass and DNS exclusions. 5. Apply policy placeholder stage. 6. Apply mode...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~27-~27: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: .... 5. Apply policy placeholder stage. 6. Apply mode-specific rules. 7. Apply DNS strat...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~28-~28: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...stage. 6. Apply mode-specific rules. 7. Apply DNS strategy rules. 8. Ensure policy ro...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🪛 markdownlint-cli2 (0.21.0)
[warning] 14-14: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 27-27: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 33-33: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 50-50: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/linux-port/09-phase2-firewall-supervisor-tests.md` around lines 3 - 55,
The file triggers markdownlint MD022 because several section headings (e.g.,
"Scope Implemented", "Firewall Apply Order", "Coexistence Modes", "Route Pref
Convergence", "Overlay Contract", "Known Gaps") are not followed by a blank
line; fix by inserting a single blank line immediately after each of those
heading lines so each heading is followed by an empty line (ensure the headings
exactly as written in the diff are updated).
| ts="$(timestamp_utc)" | ||
| log_line="ts=${ts} level=${level} component=${component} event_id=${event_id} msg=\"${message}\"" | ||
| printf '%s\n' "${log_line}" >&2 |
There was a problem hiding this comment.
Escape message before embedding it into the structured log line.
At Line 70, raw message is interpolated into msg="...". Quotes/newlines can break parsing and spoof fields.
Proposed fix
log() {
local level="${1:-INFO}"
local component="${2:-main}"
local event_id="${3:-GENERIC}"
local message="${4:-}"
- local ts log_line log_file
+ local ts log_line log_file safe_message
ts="$(timestamp_utc)"
- log_line="ts=${ts} level=${level} component=${component} event_id=${event_id} msg=\"${message}\""
+ safe_message="$(json_escape "${message}")"
+ log_line="ts=${ts} level=${level} component=${component} event_id=${event_id} msg=\"${safe_message}\""
printf '%s\n' "${log_line}" >&2🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/common.sh` around lines 69 - 71, The raw message variable is interpolated
into log_line and can contain quotes/newlines that break the structured log; add
an escaping step (e.g., a helper like escape_message) that transforms message by
escaping backslashes and double quotes and converting newlines/tabs to literal
\n/\t sequences, then use the escaped value when building log_line (so update
the log_line construction to use the escaped message instead of the raw
message); reference the timestamp_utc call remains unchanged and ensure the
helper is invoked before log_line is set.
| strip_inline_comment() { | ||
| local value="${1:-}" | ||
| local first_char="${value:0:1}" | ||
| if [[ "${first_char}" == "\"" || "${first_char}" == "'" ]]; then | ||
| printf '%s' "${value}" | ||
| return | ||
| fi | ||
| value="${value%%#*}" | ||
| printf '%s' "${value}" |
There was a problem hiding this comment.
Inline comment stripping corrupts quoted TOML values with leading whitespace.
At Line 66, quote detection happens before trimming. A value like key = "a#b" is treated as unquoted and truncated at #.
Proposed fix
strip_inline_comment() {
- local value="${1:-}"
+ local value
+ value="$(trim_space "${1:-}")"
local first_char="${value:0:1}"
if [[ "${first_char}" == "\"" || "${first_char}" == "'" ]]; then
printf '%s' "${value}"
return
fi
- value="${value%%#*}"
- printf '%s' "${value}"
+ printf '%s' "${value%%#*}"
}
normalize_toml_scalar() {
local value
- value="$(trim_space "$(strip_inline_comment "${1:-}")")"
+ value="$(strip_inline_comment "${1:-}")"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/config.sh` around lines 63 - 71, strip_inline_comment currently checks
the first character before trimming leading whitespace, so values like key =
"a#b" get treated as unquoted and truncated; change the logic in
strip_inline_comment to detect the first non-whitespace character (i.e., compute
the first non-space char of local value and use that for quote detection) and if
that char is a double or single quote, return the original value (preserving
leading whitespace), otherwise proceed to strip inline comments with
value="${value%%#*}" as before; refer to the strip_inline_comment function and
the local value/first_char variables when making this change.
| config_detect_file() { | ||
| local system_cfg="${BOX_CONFIG_FILE:-${BOX_ETC_DIR_DEFAULT}/box.toml}" | ||
| local dev_cfg="${BOX_REPO_ROOT}/etc/box/box.toml" | ||
|
|
||
| if [[ -f "${system_cfg}" ]]; then | ||
| BOX_CONFIG_FILE="${system_cfg}" | ||
| BOX_CONFIG_SOURCE="system" | ||
| return 0 | ||
| fi | ||
|
|
||
| if [[ -f "${dev_cfg}" ]]; then | ||
| BOX_CONFIG_FILE="${dev_cfg}" | ||
| BOX_CONFIG_SOURCE="dev-fallback" | ||
| return 0 | ||
| fi |
There was a problem hiding this comment.
Do not silently fall back to dev config when BOX_CONFIG_FILE is explicitly set.
At Line 128, a missing explicit config can still fall through to dev_cfg, which can load an unintended profile.
Proposed fix
config_detect_file() {
- local system_cfg="${BOX_CONFIG_FILE:-${BOX_ETC_DIR_DEFAULT}/box.toml}"
+ local explicit_cfg="${BOX_CONFIG_FILE:-}"
+ local system_cfg="${BOX_ETC_DIR_DEFAULT}/box.toml"
local dev_cfg="${BOX_REPO_ROOT}/etc/box/box.toml"
+
+ if [[ -n "${explicit_cfg}" ]]; then
+ if [[ -f "${explicit_cfg}" ]]; then
+ BOX_CONFIG_FILE="${explicit_cfg}"
+ BOX_CONFIG_SOURCE="explicit"
+ return 0
+ fi
+ log "ERROR" "config" "E_CONFIG_FILE" "explicit BOX_CONFIG_FILE not found: ${explicit_cfg}"
+ return 1
+ fi
if [[ -f "${system_cfg}" ]]; then
BOX_CONFIG_FILE="${system_cfg}"
BOX_CONFIG_SOURCE="system"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| config_detect_file() { | |
| local system_cfg="${BOX_CONFIG_FILE:-${BOX_ETC_DIR_DEFAULT}/box.toml}" | |
| local dev_cfg="${BOX_REPO_ROOT}/etc/box/box.toml" | |
| if [[ -f "${system_cfg}" ]]; then | |
| BOX_CONFIG_FILE="${system_cfg}" | |
| BOX_CONFIG_SOURCE="system" | |
| return 0 | |
| fi | |
| if [[ -f "${dev_cfg}" ]]; then | |
| BOX_CONFIG_FILE="${dev_cfg}" | |
| BOX_CONFIG_SOURCE="dev-fallback" | |
| return 0 | |
| fi | |
| config_detect_file() { | |
| local explicit_cfg="${BOX_CONFIG_FILE:-}" | |
| local system_cfg="${BOX_ETC_DIR_DEFAULT}/box.toml" | |
| local dev_cfg="${BOX_REPO_ROOT}/etc/box/box.toml" | |
| if [[ -n "${explicit_cfg}" ]]; then | |
| if [[ -f "${explicit_cfg}" ]]; then | |
| BOX_CONFIG_FILE="${explicit_cfg}" | |
| BOX_CONFIG_SOURCE="explicit" | |
| return 0 | |
| fi | |
| log "ERROR" "config" "E_CONFIG_FILE" "explicit BOX_CONFIG_FILE not found: ${explicit_cfg}" | |
| return 1 | |
| fi | |
| if [[ -f "${system_cfg}" ]]; then | |
| BOX_CONFIG_FILE="${system_cfg}" | |
| BOX_CONFIG_SOURCE="system" | |
| return 0 | |
| fi | |
| if [[ -f "${dev_cfg}" ]]; then | |
| BOX_CONFIG_FILE="${dev_cfg}" | |
| BOX_CONFIG_SOURCE="dev-fallback" | |
| return 0 | |
| fi |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/config.sh` around lines 127 - 141, In config_detect_file(), don't allow
an explicitly set BOX_CONFIG_FILE to be ignored by falling back to dev_cfg:
first check if BOX_CONFIG_FILE is non-empty and if the referenced file exists—if
it does, set BOX_CONFIG_SOURCE (e.g., "env" or "explicit") and return 0; if
BOX_CONFIG_FILE is non-empty but the file does not exist, fail fast (log an
error and return non-zero) instead of continuing to check dev_cfg; otherwise
continue with the existing logic that checks system_cfg and dev_cfg. Ensure you
reference the BOX_CONFIG_FILE variable and the config_detect_file function when
making the change so the intent is clear.
| service_restart() { | ||
| service_stop | ||
| service_start | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Potential race window between stop and start in restart.
The restart calls service_stop and service_start separately, each acquiring/releasing the "service" lock independently. This creates a brief window where another process could interfere. For CLI use this is likely acceptable, but for systemd-triggered restarts or concurrent access, consider a locked atomic restart.
💡 Alternative: atomic locked restart
service_restart_locked() {
service_stop_locked
service_start_locked
}
service_restart() {
with_lock "service" 30 service_restart_locked
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/supervisor/supervisor.sh` around lines 228 - 231, The restart currently
calls service_stop and service_start separately, leaving a race window; change
restart to perform an atomic locked restart by adding a helper
service_restart_locked that calls service_stop_locked then service_start_locked,
and update service_restart to call with_lock "service" 30 service_restart_locked
(or alternatively wrap the existing service_stop and service_start calls
together inside with_lock "service" 30) so the stop/start execute under a single
lock.
| ## Quick run (dev) | ||
| 1. Use repo-local fallback config at `etc/box/box.toml` (auto-used when `/etc/box/box.toml` is missing). | ||
| 2. Run status commands: | ||
| - `./cmd/boxctl service status` | ||
| - `./cmd/boxctl service status --json` | ||
| - `./cmd/boxctl firewall status` | ||
| - `./cmd/boxctl firewall status --json` | ||
| 3. Run privileged actions as root: | ||
| - `sudo ./cmd/boxctl service start|stop|restart` | ||
| - `sudo ./cmd/boxctl firewall enable|disable|renew` | ||
| 4. Run integration checks: | ||
| - `./tests/integration/test_phase2.sh` | ||
|
|
||
| ## Systemd units | ||
| - `systemd/box.service` | ||
| - `systemd/box-firewall.service` | ||
|
|
||
| Copy/symlink these to your systemd unit path and ensure `boxctl` is installed as `/usr/bin/boxctl`. | ||
|
|
||
| ## Phase 2 notes | ||
| - Supported cores: `mihomo`, `sing-box` | ||
| - Core overlay mutators render runtime configs under `/run/box/rendered` (or dev fallback). | ||
| - Firewall backend (`iptables`) supports staged apply + rollback for `tun`, `tproxy`, `redirect`, `mixed`, `enhance`. | ||
| - DNS strategy handling: `tproxy`, `redirect`, `disable`. | ||
| - Tailscale coexistence defaults to `dns_coexist_mode=preserve_tailnet`. | ||
| - Coexistence mode semantics: | ||
| - `preserve_tailnet`: apply tailscale bypass (`tailscale0`, `100.64.0.0/10`) and MagicDNS resolver exclusion (`100.100.100.100:53`). | ||
| - `strict_box`: do not add tailscale bypass/MagicDNS exclusion rules; still never delete non-BOX routes/rules. | ||
| - Tailscale safeguards include: | ||
| - bypass `tailscale0` | ||
| - bypass `100.64.0.0/10` and preserve `fd7a:115c:a1e0::/48` by not touching ip6tables in this backend | ||
| - bypass `100.100.100.100:53` (MagicDNS resolver) | ||
| - preserve table `52` / fwmark `0x80000/0xff0000` ownership | ||
| - Route convergence: renew/reapply prunes stale BOX fwmark rules for the same fwmark+table (any old pref) and installs exactly one rule with current `route_pref`. | ||
| - `enable|renew|disable` paths are idempotent and lock-protected. | ||
|
|
||
| ## Remaining TODO | ||
| - Full UID/GID/interface/MAC policy graph in firewall (currently placeholder stage). | ||
| - `nftables` backend. | ||
| - API-based reload hooks for `mihomo` and `sing-box`. |
There was a problem hiding this comment.
Fix heading spacing to satisfy markdownlint MD022.
Line 14, Line 27, Line 33, and Line 50 headings should be followed by a blank line.
📝 Proposed markdown fix
## Quick run (dev)
+
1. Use repo-local fallback config at `etc/box/box.toml` (auto-used when `/etc/box/box.toml` is missing).
2. Run status commands:
@@
## Systemd units
+
- `systemd/box.service`
- `systemd/box-firewall.service`
@@
## Phase 2 notes
+
- Supported cores: `mihomo`, `sing-box`
- Core overlay mutators render runtime configs under `/run/box/rendered` (or dev fallback).
@@
## Remaining TODO
+
- Full UID/GID/interface/MAC policy graph in firewall (currently placeholder stage).
- `nftables` backend.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ## Quick run (dev) | |
| 1. Use repo-local fallback config at `etc/box/box.toml` (auto-used when `/etc/box/box.toml` is missing). | |
| 2. Run status commands: | |
| - `./cmd/boxctl service status` | |
| - `./cmd/boxctl service status --json` | |
| - `./cmd/boxctl firewall status` | |
| - `./cmd/boxctl firewall status --json` | |
| 3. Run privileged actions as root: | |
| - `sudo ./cmd/boxctl service start|stop|restart` | |
| - `sudo ./cmd/boxctl firewall enable|disable|renew` | |
| 4. Run integration checks: | |
| - `./tests/integration/test_phase2.sh` | |
| ## Systemd units | |
| - `systemd/box.service` | |
| - `systemd/box-firewall.service` | |
| Copy/symlink these to your systemd unit path and ensure `boxctl` is installed as `/usr/bin/boxctl`. | |
| ## Phase 2 notes | |
| - Supported cores: `mihomo`, `sing-box` | |
| - Core overlay mutators render runtime configs under `/run/box/rendered` (or dev fallback). | |
| - Firewall backend (`iptables`) supports staged apply + rollback for `tun`, `tproxy`, `redirect`, `mixed`, `enhance`. | |
| - DNS strategy handling: `tproxy`, `redirect`, `disable`. | |
| - Tailscale coexistence defaults to `dns_coexist_mode=preserve_tailnet`. | |
| - Coexistence mode semantics: | |
| - `preserve_tailnet`: apply tailscale bypass (`tailscale0`, `100.64.0.0/10`) and MagicDNS resolver exclusion (`100.100.100.100:53`). | |
| - `strict_box`: do not add tailscale bypass/MagicDNS exclusion rules; still never delete non-BOX routes/rules. | |
| - Tailscale safeguards include: | |
| - bypass `tailscale0` | |
| - bypass `100.64.0.0/10` and preserve `fd7a:115c:a1e0::/48` by not touching ip6tables in this backend | |
| - bypass `100.100.100.100:53` (MagicDNS resolver) | |
| - preserve table `52` / fwmark `0x80000/0xff0000` ownership | |
| - Route convergence: renew/reapply prunes stale BOX fwmark rules for the same fwmark+table (any old pref) and installs exactly one rule with current `route_pref`. | |
| - `enable|renew|disable` paths are idempotent and lock-protected. | |
| ## Remaining TODO | |
| - Full UID/GID/interface/MAC policy graph in firewall (currently placeholder stage). | |
| - `nftables` backend. | |
| - API-based reload hooks for `mihomo` and `sing-box`. | |
| ## Quick run (dev) | |
| 1. Use repo-local fallback config at `etc/box/box.toml` (auto-used when `/etc/box/box.toml` is missing). | |
| 2. Run status commands: | |
| - `./cmd/boxctl service status` | |
| - `./cmd/boxctl service status --json` | |
| - `./cmd/boxctl firewall status` | |
| - `./cmd/boxctl firewall status --json` | |
| 3. Run privileged actions as root: | |
| - `sudo ./cmd/boxctl service start|stop|restart` | |
| - `sudo ./cmd/boxctl firewall enable|disable|renew` | |
| 4. Run integration checks: | |
| - `./tests/integration/test_phase2.sh` | |
| ## Systemd units | |
| - `systemd/box.service` | |
| - `systemd/box-firewall.service` | |
| Copy/symlink these to your systemd unit path and ensure `boxctl` is installed as `/usr/bin/boxctl`. | |
| ## Phase 2 notes | |
| - Supported cores: `mihomo`, `sing-box` | |
| - Core overlay mutators render runtime configs under `/run/box/rendered` (or dev fallback). | |
| - Firewall backend (`iptables`) supports staged apply + rollback for `tun`, `tproxy`, `redirect`, `mixed`, `enhance`. | |
| - DNS strategy handling: `tproxy`, `redirect`, `disable`. | |
| - Tailscale coexistence defaults to `dns_coexist_mode=preserve_tailnet`. | |
| - Coexistence mode semantics: | |
| - `preserve_tailnet`: apply tailscale bypass (`tailscale0`, `100.64.0.0/10`) and MagicDNS resolver exclusion (`100.100.100.100:53`). | |
| - `strict_box`: do not add tailscale bypass/MagicDNS exclusion rules; still never delete non-BOX routes/rules. | |
| - Tailscale safeguards include: | |
| - bypass `tailscale0` | |
| - bypass `100.64.0.0/10` and preserve `fd7a:115c:a1e0::/48` by not touching ip6tables in this backend | |
| - bypass `100.100.100.100:53` (MagicDNS resolver) | |
| - preserve table `52` / fwmark `0x80000/0xff0000` ownership | |
| - Route convergence: renew/reapply prunes stale BOX fwmark rules for the same fwmark+table (any old pref) and installs exactly one rule with current `route_pref`. | |
| - `enable|renew|disable` paths are idempotent and lock-protected. | |
| ## Remaining TODO | |
| - Full UID/GID/interface/MAC policy graph in firewall (currently placeholder stage). | |
| - `nftables` backend. | |
| - API-based reload hooks for `mihomo` and `sing-box`. |
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 14-14: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 27-27: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 33-33: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 50-50: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@README.md` around lines 14 - 53, The Markdown headings "Quick run (dev)",
"Systemd units", "Phase 2 notes", and "Remaining TODO" each lack the required
blank line after the heading; update README.md by inserting a single blank line
immediately after those heading lines so that the content following each heading
is separated (ensure the lines containing those exact headings are followed by
an empty line).
tests/fixtures/mockbin/ip
Outdated
| if [[ "${*}" == *" table "* ]]; then | ||
| table="${*: -1}" | ||
| fi |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's locate and examine the file
find . -path "*/tests/fixtures/mockbin/ip" -type fRepository: JSJ-Experiments/box4linux
Length of output: 96
🏁 Script executed:
# Read the file to understand the context
cat -n tests/fixtures/mockbin/ip | head -80Repository: JSJ-Experiments/box4linux
Length of output: 2011
🌐 Web query:
Linux ip command table argument syntax and examples
💡 Result:
ip route ... table (routing table to modify/show)
Syntax (from ip-route(8)):
ip route {show|flush} ... table TABLE_IDip route {add|del|change|append|replace} ... [table TABLE_ID]TABLE_ID := local | main | default | all | NUMBER(often you’ll use a number or a name mapped to a number) [1]
Common examples
# Show routes in a specific table
ip route show table 100
# Add a default route to a custom table
ip route add default via 192.0.2.1 dev eth0 table 100
# Add a connected route to a custom table
ip route add 10.10.0.0/16 dev eth1 scope link table 100
# Delete a route from a specific table (table is part of the “key”)
ip route del 10.10.0.0/16 dev eth1 table 100
# Flush (delete) all routes from a table
ip route flush table 100ip rule ... lookup/table (policy rule to select a routing table)
Syntax (from ip-rule(8)):
ip rule add ... table TABLEIDip rule add ... lookup TABLEID(same meaning;lookupcan be used instead oftable) [2]
Common examples
# If traffic comes from 10.10.10.0/24, route it using table 100
ip rule add from 10.10.10.0/24 lookup 100
# Route packets with firewall mark 0x11 using table 100
ip rule add fwmark 0x11 lookup 100
# Show rules (to confirm which tables are consulted, and in what priority order)
ip rule showUsing table names (instead of numbers)
You can define a name→number mapping in /etc/iproute2/rt_tables, then use the name anywhere you’d use the numeric table ID (e.g., table ISP2 / lookup ISP2). [3]
Example:
echo "100 ISP2" | sudo tee -a /etc/iproute2/rt_tables
ip route add default via 192.0.2.1 dev eth0 table ISP2
ip rule add from 10.10.10.0/24 lookup ISP2
ip route show table ISP2Sources: ip-route(8) [1], ip-rule(8) [2], iproute2 guide (rt_tables, policy routing examples) [3].
Table extraction is fragile in route:add / route:del.
The current logic assumes the table ID is always the last token by using ${*: -1}. However, the real ip route command allows trailing arguments after the table specification (e.g., ip route add 10.0.0.0/24 via 192.168.1.1 table 100 dev eth0). This causes the mock to extract the wrong table ID and record route state under an incorrect table, degrading test fidelity.
Proposed fix
+extract_table_arg() {
+ local default_table="${1}"
+ shift
+ local prev=""
+ local token
+ for token in "$@"; do
+ if [[ "${prev}" == "table" ]]; then
+ printf '%s\n' "${token}"
+ return 0
+ fi
+ prev="${token}"
+ done
+ printf '%s\n' "${default_table}"
+}
+
route:add)
route="$*"
- table="main"
- if [[ "${*}" == *" table "* ]]; then
- table="${*: -1}"
- fi
+ table="$(extract_table_arg "main" "$@")"
@@
route:del)
route="$*"
- table="main"
- if [[ "${*}" == *" table "* ]]; then
- table="${*: -1}"
- fi
+ table="$(extract_table_arg "main" "$@")"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/fixtures/mockbin/ip` around lines 42 - 44, The table extraction is
fragile because it assumes the table ID is the last token (table="${*: -1}");
replace this with logic that scans the arguments for the literal token "table"
and uses the following token as the table ID. For example, split "$@" into an
array (tokens=("$@")), loop over tokens and when tokens[i] == "table" set
table="${tokens[i+1]}" (with a guard that i+1 exists) so commands like 'ip route
add ... table 100 dev eth0' correctly pick up 100; update the conditional that
currently checks [[ "${*}" == *" table "* ]] to use this scan and only set table
when a following token is present.
| -X) | ||
| chain="${args[1]}" | ||
| grep -Fv "CHAIN|${table}|${chain}" "${STATE_FILE}" >"${STATE_FILE}.tmp" || true | ||
| grep -Fv "RULE|${table}|${chain}|" "${STATE_FILE}.tmp" >"${STATE_FILE}.tmp2" || true | ||
| mv "${STATE_FILE}.tmp2" "${STATE_FILE}" | ||
| rm -f "${STATE_FILE}.tmp" | ||
| ;; | ||
| -F) | ||
| chain="${args[1]}" | ||
| grep -Fv "RULE|${table}|${chain}|" "${STATE_FILE}" >"${STATE_FILE}.tmp" || true | ||
| mv "${STATE_FILE}.tmp" "${STATE_FILE}" | ||
| ;; | ||
| -A) | ||
| chain="${args[1]}" | ||
| rule="${args[*]:2}" | ||
| if ! rule_exists "${chain}" "${rule}"; then | ||
| printf 'RULE|%s|%s|%s\n' "${table}" "${chain}" "${rule}" >>"${STATE_FILE}" | ||
| fi | ||
| ;; | ||
| -C) | ||
| chain="${args[1]}" | ||
| rule="${args[*]:2}" | ||
| rule_exists "${chain}" "${rule}" | ||
| ;; | ||
| -D) | ||
| chain="${args[1]}" | ||
| rule="${args[*]:2}" | ||
| grep -Fv "RULE|${table}|${chain}|${rule}" "${STATE_FILE}" >"${STATE_FILE}.tmp" || true | ||
| mv "${STATE_FILE}.tmp" "${STATE_FILE}" | ||
| ;; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cd /tmp && find . -name "iptables" -path "*/fixtures/mockbin/*" 2>/dev/null | head -5Repository: JSJ-Experiments/box4linux
Length of output: 51
🏁 Script executed:
# List repository structure to locate the file
git ls-files | grep -i "iptables"Repository: JSJ-Experiments/box4linux
Length of output: 133
🏁 Script executed:
# Read the full iptables mock file to understand context and helper functions
cat -n tests/fixtures/mockbin/iptablesRepository: JSJ-Experiments/box4linux
Length of output: 3042
🌐 Web query:
iptables -X -F -A -D behavior on nonexistent chain error semantics
💡 Result:
What happens if the chain doesn’t exist?
For iptables commands that take a chain name, the chain must already exist in the selected table. If it doesn’t, iptables fails with the diagnostic:
iptables: No chain/target/match by that name.(and a non-zero exit status, commonly1) [2][4].
This applies to:
-A CHAIN ...(append): appends to “the selected chain” ⇒ chain must exist [1]. If not, you get No chain/target/match... [3][4].-D CHAIN ...(delete rule): deletes rule(s) from “the selected chain” ⇒ chain must exist [1]. If not, you get No chain/target/match... [4].-F [CHAIN](flush): flushes the selected chain (or all chains in the table if omitted) [1]. If you pass a chain name that doesn’t exist, you get No chain/target/match... [2].-X [CHAIN](delete chain): deletes the specified chain; additionally it must be empty and unreferenced [1]. If the chain doesn’t exist, you get No chain/target/match... (same class of failure) [2].
Related (often confused) case: chain exists, but the rule doesn’t
If the chain exists but -D can’t find a matching rule, the common error is:
iptables: Bad rule (does a matching rule exist in that chain?).[5]
Sources: iptables(8) man page (Rocky Linux 9.2) [1], Ask Ubuntu example [2], NXP community note on missing chains [3], Stack Overflow example showing exit status 1 for missing chain [4], Stack Overflow threads showing “Bad rule …” wording [5].
Mock should reject invalid chain/rule mutations to match real iptables.
-A, -F, -D, and -X currently succeed on non-existent chains, whereas real iptables exits with status 1 and outputs "No chain/target/match by that name." This allows integration tests to pass when actual firewall commands would fail, masking configuration bugs.
Proposed fix
-X)
chain="${args[1]}"
+ if ! chain_exists "${chain}"; then
+ exit 1
+ fi
grep -Fv "CHAIN|${table}|${chain}" "${STATE_FILE}" >"${STATE_FILE}.tmp" || true
@@
-F)
chain="${args[1]}"
+ if ! chain_exists "${chain}"; then
+ exit 1
+ fi
grep -Fv "RULE|${table}|${chain}|" "${STATE_FILE}" >"${STATE_FILE}.tmp" || true
@@
-A)
chain="${args[1]}"
rule="${args[*]:2}"
+ if ! chain_exists "${chain}"; then
+ exit 1
+ fi
if ! rule_exists "${chain}" "${rule}"; then
printf 'RULE|%s|%s|%s\n' "${table}" "${chain}" "${rule}" >>"${STATE_FILE}"
fi
@@
-D)
chain="${args[1]}"
rule="${args[*]:2}"
+ if ! chain_exists "${chain}" || ! rule_exists "${chain}" "${rule}"; then
+ exit 1
+ fi
grep -Fv "RULE|${table}|${chain}|${rule}" "${STATE_FILE}" >"${STATE_FILE}.tmp" || true🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/fixtures/mockbin/iptables` around lines 62 - 91, Handlers for iptables
options (-A, -F, -D, -X, and -C) must validate that the target chain exists and
return the real iptables failure when it does not; before performing mutations
in the -A, -F, -D and -X blocks call a chain existence check (e.g., chain_exists
"${chain}") and if it fails print to stderr "No chain/target/match by that
name." and exit 1, and for -C ensure it returns non-zero and emits the same
message when the chain is missing (while keeping rule_exists for rule presence
checks); use STATE_FILE/table/chain variables as in the diff and ensure failures
use exit status 1 rather than silently succeeding.
| assert_magicdns_bypass_rules() { | ||
| if ! grep -Fq 'RULE|mangle|BOX_DNS_MANGLE|-d 100.100.100.100 -p udp --dport 53 -j RETURN' "${MOCK_IPTABLES_STATE}"; then | ||
| printf 'ASSERT MAGICDNS FAILED: UDP resolver bypass missing\n' >&2 | ||
| cat "${MOCK_IPTABLES_STATE}" >&2 | ||
| exit 1 | ||
| fi | ||
| if ! grep -Fq 'RULE|nat|BOX_DNS_NAT|-d 100.100.100.100 -p udp --dport 53 -j RETURN' "${MOCK_IPTABLES_STATE}"; then | ||
| printf 'ASSERT MAGICDNS FAILED: NAT resolver bypass missing\n' >&2 | ||
| cat "${MOCK_IPTABLES_STATE}" >&2 | ||
| exit 1 | ||
| fi | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Extend MagicDNS bypass assertions to cover TCP as well as UDP.
Current checks only validate UDP rules, so TCP bypass regressions would pass unnoticed.
🧪 Proposed test hardening
assert_magicdns_bypass_rules() {
if ! grep -Fq 'RULE|mangle|BOX_DNS_MANGLE|-d 100.100.100.100 -p udp --dport 53 -j RETURN' "${MOCK_IPTABLES_STATE}"; then
@@
if ! grep -Fq 'RULE|nat|BOX_DNS_NAT|-d 100.100.100.100 -p udp --dport 53 -j RETURN' "${MOCK_IPTABLES_STATE}"; then
@@
+ if ! grep -Fq 'RULE|mangle|BOX_DNS_MANGLE|-d 100.100.100.100 -p tcp --dport 53 -j RETURN' "${MOCK_IPTABLES_STATE}"; then
+ printf 'ASSERT MAGICDNS FAILED: TCP resolver bypass missing\n' >&2
+ cat "${MOCK_IPTABLES_STATE}" >&2
+ exit 1
+ fi
+ if ! grep -Fq 'RULE|nat|BOX_DNS_NAT|-d 100.100.100.100 -p tcp --dport 53 -j RETURN' "${MOCK_IPTABLES_STATE}"; then
+ printf 'ASSERT MAGICDNS FAILED: NAT TCP resolver bypass missing\n' >&2
+ cat "${MOCK_IPTABLES_STATE}" >&2
+ exit 1
+ fi
}
@@
if grep -Fq 'RULE|mangle|BOX_MANGLE|-i tailscale0 -j RETURN' "${MOCK_IPTABLES_STATE}" || \
grep -Fq 'RULE|mangle|BOX_MANGLE|-m mark --mark 0x80000/0xff0000 -j RETURN' "${MOCK_IPTABLES_STATE}" || \
grep -Fq 'RULE|mangle|BOX_DNS_MANGLE|-d 100.100.100.100 -p udp --dport 53 -j RETURN' "${MOCK_IPTABLES_STATE}" || \
- grep -Fq 'RULE|nat|BOX_DNS_NAT|-d 100.100.100.100 -p udp --dport 53 -j RETURN' "${MOCK_IPTABLES_STATE}"; then
+ grep -Fq 'RULE|nat|BOX_DNS_NAT|-d 100.100.100.100 -p udp --dport 53 -j RETURN' "${MOCK_IPTABLES_STATE}" || \
+ grep -Fq 'RULE|mangle|BOX_DNS_MANGLE|-d 100.100.100.100 -p tcp --dport 53 -j RETURN' "${MOCK_IPTABLES_STATE}" || \
+ grep -Fq 'RULE|nat|BOX_DNS_NAT|-d 100.100.100.100 -p tcp --dport 53 -j RETURN' "${MOCK_IPTABLES_STATE}"; thenAlso applies to: 199-204
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/integration/test_phase2.sh` around lines 168 - 179,
assert_magicdns_bypass_rules currently only checks for UDP bypass rules; update
it (and the duplicate checks around the other occurrence) to also assert TCP
rules exist. Specifically, in the function assert_magicdns_bypass_rules add two
grep assertions mirroring the UDP ones but matching '-p tcp --dport 53 -j
RETURN' for both the mangle (RULE|mangle|BOX_DNS_MANGLE) and nat
(RULE|nat|BOX_DNS_NAT) entries so TCP resolver bypass rules are validated as
well.
…stence hardening tests
…el firewall validation
Summary by CodeRabbit
Release Notes
New Features
Documentation
Tests