
feat: add security audit agent #12

Open
rohitthink wants to merge 3 commits into main from feat/security-audit-agent

@rohitthink
Owner

Description

Adds a three-part security audit agent that scans FreeFile for PII leaks, exposed secrets, and GitHub configuration loopholes. Motivated by open-sourcing the repo and wanting continuous assurance that nothing sensitive has leaked.

What's included

1. scripts/security_audit.py — deterministic scanner

  • Scans committed files for 17 pattern families (AWS/GitHub/OpenAI/Anthropic/Slack/Stripe tokens, PEM private keys, Indian PAN/Aadhaar/GSTIN/IFSC/mobile, JWT, generic secret assignments)
  • Scans git history via `git log -S` for known personal values
  • Scans untracked surface for risky extensions not covered by .gitignore
  • Audits .gitignore coverage against a baseline of critical patterns
  • Audits GitHub repo settings via `gh api`: secret scanning, push protection, Dependabot alerts + security updates, branch protection, webhooks
  • Optional external check (`--check-external`) to search other public GitHub repos for known personal values
  • CLI: `--write-report`, `--check-external`, `--repo`, `--fail-on={warn,critical}`, `--json-only`
  • Exit codes: 0 clean, 1 warnings (med/low), 2 critical/high
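The exit-code contract above can be sketched as a small pure function. This is an illustration only: the severity labels (`CRITICAL`, `HIGH`, `MEDIUM`, `LOW`) and the function name are assumptions, and the effect of `--fail-on` is deliberately left out because the PR does not describe it.

```python
from typing import Iterable

# Severity labels are assumed; the PR only names the critical/high and med/low tiers.
BLOCKING = {"CRITICAL", "HIGH"}
WARNING = {"MEDIUM", "LOW"}


def exit_code(severities: Iterable[str]) -> int:
    """Map finding severities to 0 (clean), 1 (warnings), or 2 (critical/high)."""
    seen = {s.upper() for s in severities}
    if seen & BLOCKING:
        return 2
    if seen & WARNING:
        return 1
    return 0
```

A caller would typically end with `sys.exit(exit_code(f["severity"] for f in findings))`, assuming each finding carries a `severity` field.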

2. scripts/security_patterns.json — regex library

Each pattern supports:

  • placeholder_values — exact matches to suppress (e.g., `ABCDE1234F`, `9876543210`)
  • allowlist_context — line keywords that suppress matches (e.g., `test`, `example`, `your_`)
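To make the two suppression fields concrete, here is a minimal sketch of how a single pattern entry might be applied to one line. Only `placeholder_values` and `allowlist_context` come from the PR; the `name` and `regex` keys, the PAN regex itself (five letters, four digits, one letter), and the function name are assumptions for illustration.

```python
import re

# Hypothetical pattern entry; only placeholder_values and allowlist_context
# are field names taken from the PR description.
PAN_PATTERN = {
    "name": "indian_pan",
    "regex": r"\b[A-Z]{5}[0-9]{4}[A-Z]\b",
    "placeholder_values": ["ABCDE1234F"],
    "allowlist_context": ["test", "example", "your_"],
}


def find_matches(line: str, pattern: dict) -> list[str]:
    """Return regex matches on a line, minus suppressed ones."""
    lowered = line.lower()
    # allowlist_context: any keyword on the line suppresses every match on it.
    if any(keyword.lower() in lowered for keyword in pattern["allowlist_context"]):
        return []
    # placeholder_values: exact matches that are known-good fixtures.
    placeholders = set(pattern["placeholder_values"])
    return [m for m in re.findall(pattern["regex"], line) if m not in placeholders]
```

Note that keyword suppression here is a plain substring test, so a broad keyword such as `test` can also hide real values on lines that merely mention it; whether the actual scanner behaves this way is not stated in the PR.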

3. scripts/security_allowlist.json — intentional exposures

File-glob + rule + optional value combinations that are known-intentional. Each entry requires a `reason` string for auditability. Supports recursive `**` globs.

Current allowlist covers:

  • Maintainer name + email in `docs/launch/claude-for-oss-application.md` (required for OSS application)
  • Maintainer email in `docs/launch/oss-strengthening-playbook.md` (already public)
  • Placeholder PAN `ABCDE1234F` in frontend components and tests
  • Placeholder mobile in tests
  • History entries for the above
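The allowlist matching described above could look roughly like the following. The PR mentions a custom `glob_to_regex` but does not show it, so this is one possible translation rather than the author's implementation. The entry keys `file`, `rule`, and `value` are assumed names; only `reason` is named in the PR.

```python
import re


def glob_to_regex(glob: str) -> re.Pattern:
    """Translate a path glob to a regex; `**/` spans zero or more directories."""
    out = []
    i = 0
    while i < len(glob):
        if glob.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more whole directories
            i += 3
        elif glob.startswith("**", i):
            out.append(".*")        # anything, including slashes
            i += 2
        elif glob[i] == "*":
            out.append("[^/]*")     # anything within one path segment
            i += 1
        elif glob[i] == "?":
            out.append("[^/]")
            i += 1
        else:
            out.append(re.escape(glob[i]))
            i += 1
    return re.compile("".join(out))


def is_allowlisted(path: str, rule: str, value: str, allowlist: list[dict]) -> bool:
    """True if any entry matches the file glob, the rule, and (if given) the value."""
    for entry in allowlist:
        if not entry.get("reason"):
            raise ValueError("allowlist entry is missing a reason")
        if entry["rule"] != rule:
            continue
        if "value" in entry and entry["value"] != value:
            continue
        if glob_to_regex(entry["file"]).fullmatch(path):
            return True
    return False
```

Rejecting entries without a `reason` at match time keeps the auditability requirement enforceable rather than advisory.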

4. Private config (gitignored)

  • `.security/known_values.json` — user's real PAN, name, email for literal matching. Never committed.
  • `.security/known_values.json.example` — template (committed) so others can set up
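As a rough sketch of the literal-matching idea, the helpers below search text for known values and build the matching `git log -S` invocation for history. The flat label-to-value shape of `known_values.json` and both function names are assumptions; the PR does not show the file's schema.

```python
def literal_hits(text: str, known: dict[str, str]) -> list[tuple[int, str]]:
    """Return (line number, label) pairs where a known value appears verbatim."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, value in known.items():
            # Skip empty template values so an unfilled field never matches.
            if value and value in line:
                hits.append((lineno, label))
    return hits


def history_search_command(value: str) -> list[str]:
    """Build a pickaxe search for commits that added or removed the value."""
    return ["git", "log", "--all", "--oneline", f"-S{value}"]
```

Reporting the label rather than the value keeps the real PII out of the generated report, which seems in keeping with the gitignored design, though the PR does not say how the scanner reports these hits.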

5. Expanded .gitignore

Adds: `*.sqlite`, `*.sqlite3`, `*.key`, `*.pem`, `*.pfx`, `*.p12`, `secrets.json`, `credentials.json`, `.ssh/`, plus `.security/` state files and `.oss-metrics-history.jsonl`.
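A baseline coverage check of this kind might be as simple as the sketch below. The baseline list mirrors the patterns named in the commit message; the function name is hypothetical, and the comparison is by exact line, which is an assumption (it does not evaluate gitignore semantics such as negation or broader patterns that already cover an entry).

```python
# Baseline taken from the patterns listed in this PR's commit message.
BASELINE = [
    "*.sqlite", "*.sqlite3", "*.key", "*.pem", "*.pfx", "*.p12",
    "secrets.json", "credentials.json", ".ssh/",
]


def missing_gitignore_patterns(gitignore_text: str, baseline=BASELINE) -> list[str]:
    """Return baseline patterns that do not appear as a line in .gitignore."""
    present = {
        line.strip()
        for line in gitignore_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    return [pattern for pattern in baseline if pattern not in present]
```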

Verification performed

| Test | Expected | Actual |
| --- | --- | --- |
| Clean run on current HEAD | 0 findings, exit 0 | ✅ 0 findings, exit 0 |
| Planted AWS/Anthropic/GitHub/PAN secrets | 5 findings, exit 2 | ✅ 5 findings, exit 2 |
| Allowlist filters placeholder PANs | 5 suppressed | ✅ 5 suppressed |
| `**` glob matching (`tests/**/*.py`) | Works | ✅ Fixed via custom glob_to_regex |
| py_compile | OK | ✅ OK |

One-time hardening (already applied outside this PR)

gh api -X PUT repos/rohitthink/freefile/vulnerability-alerts
gh api -X PUT repos/rohitthink/freefile/automated-security-fixes

Dependabot alerts + automated security updates are now enabled (were disabled before this work).
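One way to confirm the setting stays enabled is to issue a GET against the same `vulnerability-alerts` endpoint, which answers 204 when alerts are on and 404 when they are off. The sketch below relies on `gh api` exiting non-zero on an HTTP error; the function name and the injectable `run` parameter are illustrative and not taken from the scanner.

```python
import subprocess


def dependabot_alerts_enabled(repo: str, run=subprocess.run) -> bool:
    """Check Dependabot alerts for `owner/name` via the GitHub CLI.

    A 204 response means enabled; a 404 makes `gh api` exit non-zero.
    """
    result = run(
        ["gh", "api", "--silent", f"repos/{repo}/vulnerability-alerts"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

Passing `run` as a parameter lets the check be exercised in tests without network access or an authenticated `gh` session. Note that a non-zero exit can also mean missing credentials, so a real audit would probably want to distinguish that case.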

Scheduled monitoring

A weekly `freefile-security-audit` scheduled task runs Mondays at 08:43 local time:

  1. Runs the scanner with `--write-report`
  2. Diffs against last week's snapshot in `.security/audit-history/`
  3. Archives today's report as `YYYY-MM-DD.json`
  4. Only notifies if new unexpected findings appear, a CRITICAL persists, or GitHub config drifts from baseline
  5. Never auto-modifies source or git history — recommendations only
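The notification rule in step 4 can be sketched as a comparison of two report snapshots. The report schema here (a `findings` list with `id` and `severity` keys, plus a `github_config` mapping) is assumed for illustration, since the PR does not show the JSON layout, and findings are assumed to already exclude allowlisted entries.

```python
def notification_reasons(previous: dict, current: dict, baseline_config: dict) -> list[str]:
    """Return the reasons to notify; an empty list means stay quiet."""
    reasons = []
    previous_ids = {f["id"] for f in previous.get("findings", [])}
    current_findings = current.get("findings", [])

    # Findings that were not present in last week's snapshot.
    new = [f for f in current_findings if f["id"] not in previous_ids]
    if new:
        reasons.append(f"{len(new)} new finding(s)")

    # CRITICAL findings that were already present last week.
    persisting = [
        f for f in current_findings
        if f["severity"] == "CRITICAL" and f["id"] in previous_ids
    ]
    if persisting:
        reasons.append(f"{len(persisting)} CRITICAL finding(s) persist")

    # GitHub settings that no longer match the recorded baseline.
    if current.get("github_config", {}) != baseline_config:
        reasons.append("GitHub config drifted from baseline")
    return reasons
```

Returning reasons rather than a boolean lets the notification say why it fired, and the function only reads its inputs, consistent with the recommendations-only rule in step 5.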

How to use locally

# One-time setup: create the known-values file with your real PII
cp .security/known_values.json.example .security/known_values.json
# edit .security/known_values.json with your real PAN/name/email (it's gitignored)

# Run the audit
./venv/bin/python scripts/security_audit.py --write-report

# Optional: also check other public GitHub repos for your personal values
./venv/bin/python scripts/security_audit.py --write-report --check-external

Type of Change

  • New feature
  • Tests (planted-secret verification)
  • Documentation (this PR + docstrings)

Checklist

  • `python -m py_compile scripts/security_audit.py` — OK
  • Scanner runs cleanly on current HEAD (0 findings)
  • Scanner correctly catches planted secrets (exit 2)
  • No new code style issues
  • `.security/known_values.json` confirmed gitignored (contains real PII)
  • `scripts/security_audit.py` and `scripts/security_patterns.json` self-exclude from scanning (they contain their own pattern strings)

ro4git25 and others added 2 commits April 15, 2026 19:06
Introduces a three-part security audit system to detect PII leaks and
configuration loopholes:

1. scripts/security_audit.py — deterministic Python scanner that checks
   committed files, git history, untracked surface, .gitignore coverage,
   and GitHub repo security settings. Supports --write-report,
   --check-external, --repo, --fail-on, and --json-only flags. Exits 2 on
   CRITICAL/HIGH, 1 on warnings, 0 on clean.

2. scripts/security_patterns.json — regex library with severity ratings
   for AWS/GitHub/OpenAI/Anthropic/Slack/Stripe tokens, PEM private keys,
   Indian PAN/Aadhaar/GSTIN/IFSC/mobile, JWT, and generic secret
   assignments. Each pattern supports placeholder_values to filter
   known-good test fixtures and allowlist_context to filter by line
   keywords.

3. scripts/security_allowlist.json — explicit list of intentional
   exposures (e.g., maintainer name/email in OSS grant application docs,
   placeholder PAN values in UI components and tests) with required
   reason strings for auditability. Uses file-globs with ** support.

The scanner also reads .security/known_values.json (gitignored) to
search for the user's real PAN/name/email via literal match in
committed files and git history — far more reliable than regex alone.
A .security/known_values.json.example template ships in the repo.

.gitignore is expanded to close gaps found during initial scan:
*.sqlite, *.sqlite3, *.key, *.pem, *.pfx, *.p12, secrets.json,
credentials.json, .ssh/, plus the .security/ state files.

A weekly scheduled task (freefile-security-audit, Mondays 08:43 local)
runs the scanner, diffs against last week's report, and only notifies
when new unexpected findings appear, a CRITICAL persists, or GitHub
config drifts from baseline.

Verification:
- Dry clean run on current HEAD: 0 findings, 5 allowlisted
- Planted-secret test (AWS/Anthropic/GitHub tokens + non-placeholder
  PAN in scripts/_test_secret_plant.py): 5 findings detected, exit 2
- Dependabot alerts + automated security updates enabled via
  gh api (one-time hardening action)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The backend CI job has been failing since the workflow was introduced
in e52b4a2 — two pre-existing bugs surfaced on this PR's CI run:

1. pytest, pytest-asyncio, and httpx are not in requirements.txt, so
   `pip install -r requirements.txt` leaves them uninstalled and the
   subsequent `pytest tests/` step fails with "command not found".

2. Even with pytest installed, plain `pytest` invocation doesn't add
   the project root to sys.path, so `from backend.main import app`
   in tests/conftest.py fails with ModuleNotFoundError.

Fixes:
- Add pytest>=8.0.0, pytest-asyncio>=0.23.0, httpx>=0.27.0 to
  requirements.txt (consistent with playwright already being there).
- Add a minimal pytest.ini with `pythonpath = .` so tests can import
  the backend package from the project root regardless of how pytest
  is invoked.

Verified locally: all 39 tests pass with plain `pytest tests/`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@rohitthink
Owner Author

CI fix added (ab5e464)

Pushing this PR surfaced a pre-existing CI bug introduced by e52b4a2 when the workflow was first added. Two issues:

  1. `pytest`, `pytest-asyncio`, and `httpx` were missing from `requirements.txt`, so `pip install -r requirements.txt` in CI left them uninstalled and the test step failed with 'pytest: command not found' (which is why CI has been silently red on all prior commits).
  2. Even with pytest installed, a plain `pytest` invocation doesn't add the project root to `sys.path`, so `from backend.main import app` in `tests/conftest.py` would fail with `ModuleNotFoundError`.

Added:

  • Test deps to requirements.txt
  • Minimal `pytest.ini` with `pythonpath = .`

Verified locally: all 39 tests pass with plain `pytest tests/`. Scope-wise this is orthogonal to the security audit agent, but the CI was blocking this PR and shipping both together unblocks everything else too.
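For readers unfamiliar with the `pythonpath` ini option, its effect is roughly equivalent to putting the project root on `sys.path` before tests are collected. The helper below is a hand-written approximation of that effect, not the mechanism pytest uses internally and not what this PR ships; the function name is hypothetical.

```python
import sys
from pathlib import Path


def add_project_root(root: Path) -> bool:
    """Prepend the resolved root to sys.path; return False if already present."""
    resolved = str(root.resolve())
    if resolved in sys.path:
        return False
    sys.path.insert(0, resolved)
    return True
```

The ini setting is the tidier choice because it applies however pytest is invoked and needs no code in `conftest.py`.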

Add show-hn-post.md with the title, body, technical first comment, and
pre-launch checklist for the Day 6 Show HN submission (playbook §5.3).

Mark awesome-privacy #500 and awesome-fastapi #281 as CLOSED in the PR
tracker; both were closed 2026-04-15. #500 was rejected for repo being
too new (<16 weeks) and 0 stars — plan to resubmit at Day 14+ once
metrics improve. Active monitoring narrowed to 3 PRs (#2, #663, #766).