fix(security): require explicit trust for local scan config#714
fix(security): require explicit trust for local scan config#714
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughLocal Changes
Sequence DiagramsequenceDiagram
participant User as User
participant CLI as CLI
participant Finder as LocalConfigFinder
participant Store as TrustedConfigStore
participant Loader as ConfigLoader
User->>CLI: run scan
CLI->>Finder: find_local_config_for_paths(paths)
Finder-->>CLI: LocalConfigCandidate / None
alt candidate found
CLI->>Store: is_trusted(candidate)
Store-->>CLI: true / false
alt already trusted
CLI->>Loader: load and apply local config
Loader-->>CLI: ModelAuditConfig
else not trusted
alt interactive session
CLI->>User: prompt to trust config?
User-->>CLI: yes / no
alt yes
CLI->>Store: trust(candidate)
Store-->>CLI: stored
CLI->>Loader: load and apply local config
Loader-->>CLI: ModelAuditConfig
else no
CLI->>Loader: use default config
Loader-->>CLI: ModelAuditConfig
end
else non-interactive
CLI->>Loader: use default config
Loader-->>CLI: ModelAuditConfig
end
end
else no candidate
CLI->>Loader: use default config
Loader-->>CLI: ModelAuditConfig
end
CLI->>User: run scan with chosen config
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelaudit/cache/trusted_config_store.py`:
- Around line 142-153: Replace the manual parent-walk in _has_symlink_component
with a simpler resolve-based check: use Path.resolve(strict=False) and return
True when path != path.resolve(strict=False) (this detects any symlink component
anywhere in the path while avoiding raising on broken symlinks); update the
function to call Path.resolve(strict=False) on the input Path and return the
comparison result instead of the loop and OSError handling.
In `@tests/test_cli.py`:
- Around line 179-205: The test function
test_scan_can_apply_local_config_once_when_confirmed has untyped parameters; add
explicit type annotations to match repo rules by importing Path from pathlib and
annotating tmp_path: Path and monkeypatch: pytest.MonkeyPatch (ensure pytest is
imported or referenced) and keep the existing return type -> None; apply the
same parameter annotation fixes to the other new/nearby test functions mentioned
in the comment to ensure all tests in this diff use tmp_path: Path and
monkeypatch: pytest.MonkeyPatch.
- Around line 179-205: The test must also assert that choosing "use once" did
not persist trust; after invoking the CLI in
test_scan_can_apply_local_config_once_when_confirmed, instantiate the same
TrustedConfigStore used in the monkeypatch (TrustedConfigStore(tmp_path /
"cache" / "trusted_local_configs.json")) or read that JSON file and assert it
either does not exist or contains no trusted entries for the model path,
ensuring the one-time branch didn't write a trusted-config record.
In `@tests/test_rules.py`:
- Around line 190-198: The test method
test_from_cli_args_uses_provided_base_config is missing the return type
annotation; update its signature to include -> None (i.e., def
test_from_cli_args_uses_provided_base_config(self) -> None:) so it follows the
repo's typing convention for tests; locate and edit the method declaration in
tests/test_rules.py to add the annotation without changing the body or
assertions.
- Around line 246-259: The test only passes a single path so it never exercises
the multi-path branch of find_local_config_for_paths; update the
test_find_local_config_for_paths_uses_shared_ancestor to create a second file
under a different subdirectory of the same repo root (e.g., create another
nested folder and file alongside nested/model.pkl), call
find_local_config_for_paths with both file paths, and assert the returned
candidate is not None and still points to the same config_dir and config_path
(root and root / ".modelaudit.toml") to validate the shared-ancestor merging
logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6d033ff6-bb5a-4b45-8ff7-01bde0fda348
📒 Files selected for processing (8)
CHANGELOG.mddocs/user/security-model.mdmodelaudit/cache/trusted_config_store.pymodelaudit/cli.pymodelaudit/config/local_config.pymodelaudit/config/rule_config.pytests/test_cli.pytests/test_rules.py
| def _has_symlink_component(path: Path) -> bool: | ||
| """Return True when path or an ancestor is a symlink.""" | ||
| current = path | ||
| while True: | ||
| try: | ||
| if current.exists() and current.is_symlink(): | ||
| return True | ||
| except OSError: | ||
| return True | ||
| if current == current.parent: | ||
| return False | ||
| current = current.parent |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider simplifying symlink detection.
The _has_symlink_component function manually walks parent directories to detect symlinks. This is functionally correct and fail-safe (returns True on OSError).
An alternative approach would be to compare path.resolve() with the original path, which would detect symlinks anywhere in the path. However, the current implementation is more explicit and handles edge cases like broken symlinks more predictably.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modelaudit/cache/trusted_config_store.py` around lines 142 - 153, Replace the
manual parent-walk in _has_symlink_component with a simpler resolve-based check:
use Path.resolve(strict=False) and return True when path !=
path.resolve(strict=False) (this detects any symlink component anywhere in the
path while avoiding raising on broken symlinks); update the function to call
Path.resolve(strict=False) on the input Path and return the comparison result
instead of the loop and OSError handling.
| def test_find_local_config_for_paths_uses_shared_ancestor(tmp_path: Path) -> None: | ||
| """All scanned paths under the same config root should resolve one candidate.""" | ||
| root = tmp_path / "repo" | ||
| nested = root / "models" / "nested" | ||
| nested.mkdir(parents=True) | ||
| (root / ".modelaudit.toml").write_text('suppress = ["S710"]\n') | ||
| file_path = nested / "model.pkl" | ||
| file_path.write_bytes(b"test") | ||
|
|
||
| candidate = find_local_config_for_paths([str(file_path)]) | ||
|
|
||
| assert candidate is not None | ||
| assert candidate.config_dir == root | ||
| assert candidate.config_path == root / ".modelaudit.toml" |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Exercise the multi-path branch in the shared-ancestor test.
With a single file, this only proves upward resolution. The equality check across multiple resolved candidates stays untested, so a regression in the actual shared-root comparison would slip through.
🧪 Cover the real shared-ancestor case
- file_path = nested / "model.pkl"
- file_path.write_bytes(b"test")
+ first = nested / "model-one.pkl"
+ second = nested / "model-two.pkl"
+ first.write_bytes(b"one")
+ second.write_bytes(b"two")
- candidate = find_local_config_for_paths([str(file_path)])
+ candidate = find_local_config_for_paths([str(first), str(second)])🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_rules.py` around lines 246 - 259, The test only passes a single
path so it never exercises the multi-path branch of find_local_config_for_paths;
update the test_find_local_config_for_paths_uses_shared_ancestor to create a
second file under a different subdirectory of the same repo root (e.g., create
another nested folder and file alongside nested/model.pkl), call
find_local_config_for_paths with both file paths, and assert the returned
candidate is not None and still points to the same config_dir and config_path
(root and root / ".modelaudit.toml") to validate the shared-ancestor merging
logic.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelaudit/cache/trusted_config_store.py`:
- Around line 29-30: Add an explicit return type hint to the TrustedConfigStore
constructor: update the TrustedConfigStore.__init__ signature to annotate the
return type as -> None (e.g., def __init__(self, store_path: Path | None = None)
-> None:), leaving the body unchanged so it complies with the repo typing rule.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 55d6d1d1-50a0-4b0d-91f2-694a8ebd823b
📒 Files selected for processing (3)
modelaudit/cache/trusted_config_store.pytests/test_cli.pytests/test_rules.py
| def __init__(self, store_path: Path | None = None): | ||
| self.store_path = store_path or (Path.home() / ".modelaudit" / "cache" / "trusted_local_configs.json") |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Add the missing constructor return type hint.
TrustedConfigStore.__init__ should explicitly annotate -> None to comply with the repo typing rule.
♻️ Proposed fix
-class TrustedConfigStore:
+class TrustedConfigStore:
@@
- def __init__(self, store_path: Path | None = None):
+ def __init__(self, store_path: Path | None = None) -> None:
self.store_path = store_path or (Path.home() / ".modelaudit" / "cache" / "trusted_local_configs.json")As per coding guidelines, **/*.py: Always include type hints in Python code.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modelaudit/cache/trusted_config_store.py` around lines 29 - 30, Add an
explicit return type hint to the TrustedConfigStore constructor: update the
TrustedConfigStore.__init__ signature to annotate the return type as -> None
(e.g., def __init__(self, store_path: Path | None = None) -> None:), leaving the
body unchanged so it complies with the repo typing rule.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_cli.py`:
- Around line 208-239: Add a new test alongside
test_scan_can_remember_trusted_local_config that verifies trusted-local-config
is invalidated when the .modelaudit.toml changes: reuse TrustedConfigStore and
CliRunner, perform an initial trusted run (like calling runner.invoke(cli,
["scan", str(model_file), "--format", "text"], input="a\n") with monkeypatched
can_use_trusted_local_config and get_trusted_config_store), then mutate the
.modelaudit.toml file contents (e.g., append or modify text) before invoking cli
a second time and assert the second run no longer silently uses the trusted
config (i.e., the interactive prompt or "Found local ModelAudit config" message
appears on the second run), and also assert the trust_store.store_path still
exists but the trust decision was invalidated after the file change; reference
test_scan_can_remember_trusted_local_config, TrustedConfigStore, runner.invoke,
and .modelaudit.toml when locating where to add this test.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: fc5aa424-a75a-450e-a759-d9cab3cf152d
📒 Files selected for processing (1)
tests/test_cli.py
| def test_scan_can_remember_trusted_local_config(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None: | ||
| """Choosing to trust a local config should persist for future interactive runs.""" | ||
| import tarfile | ||
|
|
||
| model_file = tmp_path / "evil.tar" | ||
| with tarfile.open(model_file, "w") as tar: | ||
| payload = tmp_path / "payload.txt" | ||
| payload.write_text("content") | ||
| tar.add(payload, arcname="../evil.txt") | ||
|
|
||
| (tmp_path / ".modelaudit.toml").write_text('suppress = ["S405"]\n') | ||
| trust_store = TrustedConfigStore(tmp_path / "cache" / "trusted_local_configs.json") | ||
|
|
||
| monkeypatch.chdir(tmp_path) | ||
| monkeypatch.setattr("modelaudit.cli.can_use_trusted_local_config", lambda output_format: output_format == "text") | ||
| monkeypatch.setattr("modelaudit.cli.get_trusted_config_store", lambda: trust_store) | ||
|
|
||
| runner = CliRunner() | ||
| first_result = runner.invoke( | ||
| cli, | ||
| ["scan", str(model_file), "--format", "text"], | ||
| input="a\n", | ||
| catch_exceptions=False, | ||
| ) | ||
| second_result = runner.invoke(cli, ["scan", str(model_file), "--format", "text"], catch_exceptions=False) | ||
|
|
||
| assert first_result.exit_code == 0 | ||
| assert second_result.exit_code == 0 | ||
| assert trust_store.store_path.exists() | ||
| assert "Found local ModelAudit config" in strip_ansi(first_result.output) | ||
| assert "Found local ModelAudit config" not in strip_ansi(second_result.output) | ||
|
|
There was a problem hiding this comment.
Add a regression test for trusted-config invalidation after file changes.
This verifies persistence, but not the required invalidation path when .modelaudit.toml changes. Please add a follow-up test that mutates the config between runs and asserts trust is no longer honored automatically.
♻️ Suggested test shape
+def test_scan_invalidates_trusted_local_config_when_config_changes(
+ tmp_path: Path, monkeypatch: pytest.MonkeyPatch
+) -> None:
+ import tarfile
+
+ model_file = tmp_path / "evil.tar"
+ with tarfile.open(model_file, "w") as tar:
+ payload = tmp_path / "payload.txt"
+ payload.write_text("content")
+ tar.add(payload, arcname="../evil.txt")
+
+ config_path = tmp_path / ".modelaudit.toml"
+ config_path.write_text('suppress = ["S405"]\n')
+ trust_store = TrustedConfigStore(tmp_path / "cache" / "trusted_local_configs.json")
+
+ monkeypatch.chdir(tmp_path)
+ monkeypatch.setattr("modelaudit.cli.can_use_trusted_local_config", lambda output_format: output_format == "text")
+ monkeypatch.setattr("modelaudit.cli.get_trusted_config_store", lambda: trust_store)
+
+ runner = CliRunner()
+ first = runner.invoke(cli, ["scan", str(model_file), "--format", "text"], input="a\n", catch_exceptions=False)
+ assert first.exit_code == 0
+
+ config_path.write_text('suppress = []\n')
+
+ second = runner.invoke(cli, ["scan", str(model_file), "--format", "text"], input="n\n", catch_exceptions=False)
+ second_output = strip_ansi(second.output)
+ assert "Found local ModelAudit config" in second_output
+ assert second.exit_code == 1🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_cli.py` around lines 208 - 239, Add a new test alongside
test_scan_can_remember_trusted_local_config that verifies trusted-local-config
is invalidated when the .modelaudit.toml changes: reuse TrustedConfigStore and
CliRunner, perform an initial trusted run (like calling runner.invoke(cli,
["scan", str(model_file), "--format", "text"], input="a\n") with monkeypatched
can_use_trusted_local_config and get_trusted_config_store), then mutate the
.modelaudit.toml file contents (e.g., append or modify text) before invoking cli
a second time and assert the second run no longer silently uses the trusted
config (i.e., the interactive prompt or "Found local ModelAudit config" message
appears on the second run), and also assert the trust_store.store_path still
exists but the trust decision was invalidated after the file change; reference
test_scan_can_remember_trusted_local_config, TrustedConfigStore, runner.invoke,
and .modelaudit.toml when locating where to add this test.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
tests/test_cli.py (1)
210-241:⚠️ Potential issue | 🟠 MajorMissing CLI regression for trust invalidation after config mutation.
This test verifies persistence, but not the required invalidation path. Between Line 233 and Line 234, mutate
.modelaudit.tomland assert the second run re-prompts (or at least re-detects) instead of silently honoring prior trust.♻️ Suggested test update
def test_scan_can_remember_trusted_local_config(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None: @@ - (tmp_path / ".modelaudit.toml").write_text('suppress = ["S405"]\n') + config_path = tmp_path / ".modelaudit.toml" + config_path.write_text('suppress = ["S405"]\n') @@ first_result = runner.invoke( cli, ["scan", str(model_file), "--format", "text"], input="a\n", catch_exceptions=False, ) - second_result = runner.invoke(cli, ["scan", str(model_file), "--format", "text"], catch_exceptions=False) + # Mutate config to invalidate remembered trust + config_path.write_text("suppress = []\n") + second_result = runner.invoke( + cli, + ["scan", str(model_file), "--format", "text"], + input="n\n", + catch_exceptions=False, + ) @@ - assert second_result.exit_code == 0 + assert second_result.exit_code == 1 assert trust_store.store_path.exists() assert "Found local ModelAudit config" in strip_ansi(first_result.output) - assert "Found local ModelAudit config" not in strip_ansi(second_result.output) + assert "Found local ModelAudit config" in strip_ansi(second_result.output)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_cli.py` around lines 210 - 241, Update the test_scan_can_remember_trusted_local_config to exercise trust invalidation: after the first runner.invoke call (the variable first_result) mutate the local config file (the .modelaudit.toml file in tmp_path) so its contents change, then run the second runner.invoke; assert that the CLI (invoked via runner.invoke on cli with "scan") no longer silently honors the prior trust by checking that the prompt/detection for "Found local ModelAudit config" appears (or that a re-prompt occurs) in the second_result output, and ensure the TrustedConfigStore (trust_store) still exists but the CLI re-evaluated the config; keep existing monkeypatch usage for can_use_trusted_local_config and get_trusted_config_store unchanged so only the config mutation triggers invalidation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_cli.py`:
- Around line 257-265: The test relies on prompting behavior tied to output
format but invokes the CLI without explicitly setting format; update the
invocation of cli in the test (the runner.invoke call that uses CliRunner and
the cli function with target_file) to pass the explicit flag "--format text" in
the argument list so can_use_trusted_local_config(output_format) sees "text"
deterministically (refer to can_use_trusted_local_config,
get_trusted_config_store, scan_model_directory_or_file, CliRunner, cli, and
target_file to locate the call).
---
Duplicate comments:
In `@tests/test_cli.py`:
- Around line 210-241: Update the test_scan_can_remember_trusted_local_config to
exercise trust invalidation: after the first runner.invoke call (the variable
first_result) mutate the local config file (the .modelaudit.toml file in
tmp_path) so its contents change, then run the second runner.invoke; assert that
the CLI (invoked via runner.invoke on cli with "scan") no longer silently honors
the prior trust by checking that the prompt/detection for "Found local
ModelAudit config" appears (or that a re-prompt occurs) in the second_result
output, and ensure the TrustedConfigStore (trust_store) still exists but the CLI
re-evaluated the config; keep existing monkeypatch usage for
can_use_trusted_local_config and get_trusted_config_store unchanged so only the
config mutation triggers invalidation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 52c5d432-0709-40de-b875-1bbb3f08d9e0
📒 Files selected for processing (1)
tests/test_cli.py
| monkeypatch.setattr("modelaudit.cli.can_use_trusted_local_config", lambda output_format: output_format == "text") | ||
| monkeypatch.setattr( | ||
| "modelaudit.cli.get_trusted_config_store", | ||
| lambda: TrustedConfigStore(tmp_path / "cache" / "trusted_local_configs.json"), | ||
| ) | ||
| monkeypatch.setattr("modelaudit.cli.scan_model_directory_or_file", fake_scan_model_directory_or_file) | ||
|
|
||
| runner = CliRunner() | ||
| result = runner.invoke(cli, ["scan", str(target_file)], input="y\n", catch_exceptions=False) |
There was a problem hiding this comment.
Make output format explicit to keep this test deterministic.
Line 257 gates trusted-config prompting to "text", but Line 265 relies on the CLI default format. Add --format text so this test remains stable if defaults change.
♻️ Suggested fix
- result = runner.invoke(cli, ["scan", str(target_file)], input="y\n", catch_exceptions=False)
+ result = runner.invoke(
+ cli,
+ ["scan", str(target_file), "--format", "text"],
+ input="y\n",
+ catch_exceptions=False,
+ )As per coding guidelines, tests/**/*.py: “Use deterministic fixtures only in tests …”.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_cli.py` around lines 257 - 265, The test relies on prompting
behavior tied to output format but invokes the CLI without explicitly setting
format; update the invocation of cli in the test (the runner.invoke call that
uses CliRunner and the cli function with target_file) to pass the explicit flag
"--format text" in the argument list so
can_use_trusted_local_config(output_format) sees "text" deterministically (refer
to can_use_trusted_local_config, get_trusted_config_store,
scan_model_directory_or_file, CliRunner, cli, and target_file to locate the
call).
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CHANGELOG.md`:
- Line 84: The changelog line uses the phrase "interactive text run," which is
unclear; update the sentence in CHANGELOG.md replacing "interactive text run"
with a clearer term such as "interactive scan" or "interactive mode" (e.g.,
change the bullet text under **security:** to read "...unless a human explicitly
trusts that config in an interactive scan; remembered trust..."), keeping the
rest of the sentence intact and preserving the note about remembered trust and
invalidation when the config changes.
|
|
||
| ### Fixed | ||
|
|
||
| - **security:** stop auto-applying local `.modelaudit.toml` and `pyproject.toml` rule config during scans unless a human explicitly trusts that config in an interactive text run; remembered trust is stored securely under the local ModelAudit cache and invalidated when the config changes |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Optional: Consider clarifying "interactive text run" terminology.
The entry accurately describes the security fix. However, the phrase "interactive text run" could be more idiomatic. Consider alternatives like "interactive scan", "interactive mode", or "interactive session" for better clarity.
Example:
-- **security:** stop auto-applying local `.modelaudit.toml` and `pyproject.toml` rule config during scans unless a human explicitly trusts that config in an interactive text run; remembered trust is stored securely under the local ModelAudit cache and invalidated when the config changes
+- **security:** stop auto-applying local `.modelaudit.toml` and `pyproject.toml` rule config during scans unless a human explicitly trusts that config in an interactive scan; remembered trust is stored securely under the local ModelAudit cache and invalidated when the config changes📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - **security:** stop auto-applying local `.modelaudit.toml` and `pyproject.toml` rule config during scans unless a human explicitly trusts that config in an interactive text run; remembered trust is stored securely under the local ModelAudit cache and invalidated when the config changes | |
| - **security:** stop auto-applying local `.modelaudit.toml` and `pyproject.toml` rule config during scans unless a human explicitly trusts that config in an interactive scan; remembered trust is stored securely under the local ModelAudit cache and invalidated when the config changes |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@CHANGELOG.md` at line 84, The changelog line uses the phrase "interactive
text run," which is unclear; update the sentence in CHANGELOG.md replacing
"interactive text run" with a clearer term such as "interactive scan" or
"interactive mode" (e.g., change the bullet text under **security:** to read
"...unless a human explicitly trusts that config in an interactive scan;
remembered trust..."), keeping the rest of the sentence intact and preserving
the note about remembered trust and invalidation when the config changes.
Summary
Testing
Summary by CodeRabbit
Security Improvements
New Features
Documentation
Tests