Skip to content

fix(cli): catch corrupted credentials file at import time#282

Open
deanq wants to merge 2 commits intomainfrom
deanq/ae-2576-bug-corrupted-credentials-file-crash
Open

fix(cli): catch corrupted credentials file at import time#282
deanq wants to merge 2 commits intomainfrom
deanq/ae-2576-bug-corrupted-credentials-file-crash

Conversation

@deanq
Copy link
Member

@deanq deanq commented Mar 20, 2026

Related to runpod/runpod-python#481

Summary

  • New thin CLI entrypoint (entrypoint.py) wraps the import of runpod_flash.cli.main in a try/except to catch TOMLDecodeError from the runpod package's import-time credential read.
  • Surfaces a clean error message suggesting flash login instead of a raw traceback.
  • Console script updated in pyproject.toml: flash = "runpod_flash.cli.entrypoint:main"

Root cause: runpod/__init__.py calls get_credentials() -> toml.load() at import time. Flash's existing handler in credentials.py never runs because the crash happens during import runpod. This is defense-in-depth alongside the root cause fix in runpod-python.

Fixes: AE-2576
Depends on: runpod/runpod-python#478 (root cause fix)
Related: AE-2485 (fixed Flash's own read path)

Test plan

  • test_normal_import_runs_app -- happy path invokes Typer app
  • test_corrupted_toml_shows_clean_error -- TOMLDecodeError gives clean stderr + exit 1
  • test_non_toml_value_error_propagates -- unrelated ValueError still raises
  • All 2538 existing tests pass, 85.83% coverage

The runpod package reads ~/.runpod/config.toml at import time before
Flash error handling runs. Wrap the CLI entrypoint import in a
try/except to surface a clean error instead of a raw TOMLDecodeError
traceback.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a defensive CLI entrypoint wrapper so flash can fail gracefully when runpod crashes at import-time due to a corrupted ~/.runpod/config.toml, instead of showing a raw traceback.

Changes:

  • Introduce runpod_flash.cli.entrypoint:main to wrap importing the Typer app and catch TOML decode failures.
  • Update the console script target from runpod_flash.cli.main:app to the new wrapper entrypoint.
  • Add unit tests covering the happy path, TOML decode handling, and non-TOML ValueError propagation.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/runpod_flash/cli/entrypoint.py New thin entrypoint that catches import-time TOML decode errors and prints a clean remediation message.
pyproject.toml Points flash console script to the new entrypoint wrapper.
tests/unit/cli/test_entrypoint.py New tests validating wrapper behavior for success, TOML decode, and unrelated errors.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Address review feedback:
- Match on class name (TOMLDecodeError) and module prefix instead of
  string-matching exception messages to avoid false positives
- Update comments to reference all TOML library variants
Copy link
Contributor

@runpod-Henrik runpod-Henrik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1. Approach — correct, live-verified

The root cause (runpod/__init__.py calling toml.load() at import time before any Flash error handling can run) requires a wrapper outside the normal import chain. The thin entrypoint that catches during from runpod_flash.cli.main import app is the right level to intercept this. The PR description correctly frames this as defense-in-depth alongside the runpod-python root cause fix.

Live-verified on PR branch: corrupted config.toml produces a clean error and exits 1. Normal operation unaffected.

2. Bug: parse error location dropped — user can't identify or fix the corruption

Error: ~/.runpod/config.toml is corrupted and cannot be parsed.
Run 'flash login' to re-authenticate, or delete the file and retry.

The TOMLDecodeError includes the parse location (e.g. "Expected '=' after a key in a key/value pair (at line 1, column 6)"). The current message drops it. A user whose config is partially corrupted (e.g. a single bad line after a manual edit) can't identify or repair the file — they're told to delete it or re-authenticate regardless. Including str(exc) would help:

print(
    f"Error: ~/.runpod/config.toml is corrupted and cannot be parsed: {exc}\n"
    "Run 'flash login' to re-authenticate, or delete the file and retry.",
    file=sys.stderr,
)

3. Question: detection logic — implicit tomllib coverage

The check is:

exc_module.startswith("toml")
or exc_module.startswith("tomli")
or exc_module.startswith("tomllib")

"tomllib".startswith("toml") and "tomli".startswith("toml") are both True, so the last two branches are unreachable — startswith("toml") covers all three. Either collapse to one check, or keep all three explicitly with a comment.

Nits

  • raise SystemExit(1) from None suppresses the exception chain entirely. If a user runs flash with --debug or in a context where tracebacks are useful, the chain is gone. Acceptable for a user-facing CLI, but worth noting.
  • No test for ImportError (missing runpod package) — correctly propagates uncaught, but a comment in the source noting which error classes are intentionally not caught would help future readers.

Verdict: PASS WITH NITS

The fix works. Item 2 is the main nit worth addressing before merge — the parse location in the error message is the difference between "delete everything and re-login" and "fix line 3 of your config file".

🤖 Reviewed by Henrik's AI-Powered Bug Finder

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants