
feat: add daily contract check for API schema drift detection #338

Open
Astro-Han wants to merge 1 commit into jackwener:main from Astro-Han:worktree-contract-check

Astro-Han commented Mar 24, 2026

Description

Add a daily CI workflow that detects schema drift in CLI command output, catching upstream API structural changes before users encounter broken adapters.

Related issue: #50

How it works

  1. Daily cron (contract-check.yml) runs 25 public API commands across 15 sites
  2. Schema extraction (schema.ts) captures field names, types (distinguishing arrays from objects), and presence rates from command output
  3. Snapshot diff compares current schema against previous baseline stored as CI artifacts
  4. Four drift types detected: field_added, field_removed, type_changed, presence_dropped
  5. Drift report uploaded as artifact with human-readable summary in CI logs
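Steps 2 to 4 can be sketched roughly as follows. This is a minimal illustration, not the code in schema.ts: the type names, the `extractSchema` and `diffSchemas` helpers, and the `maxDrop` threshold of 0.2 for `presence_dropped` are all assumptions made for the example.

```typescript
// Hypothetical types and helpers; the PR's schema.ts may be structured differently.
type FieldType = "string" | "number" | "boolean" | "null" | "array" | "object";
interface FieldSchema { type: FieldType; presence: number } // presence: fraction of rows, 0..1
type Schema = Record<string, FieldSchema>;
type DriftKind = "field_added" | "field_removed" | "type_changed" | "presence_dropped";
interface Drift { kind: DriftKind; field: string; detail: string }

// typeof reports "object" for both arrays and null, so those are checked first.
function typeOf(value: unknown): FieldType {
  if (value === null || value === undefined) return "null";
  if (Array.isArray(value)) return "array";
  const t = typeof value;
  if (t === "string" || t === "number" || t === "boolean") return t;
  return "object";
}

// Build a schema from the rows a command printed (assumed to be JSON objects).
function extractSchema(rows: Array<Record<string, unknown>>): Schema {
  const schema: Schema = {};
  for (const row of rows) {
    for (const key of Object.keys(row)) {
      const type = typeOf(row[key]);
      const entry = schema[key];
      if (!entry) {
        schema[key] = { type, presence: 1 }; // presence holds a raw count for now
      } else {
        entry.presence += 1;
        if (entry.type === "null") entry.type = type; // prefer the first non-null type
      }
    }
  }
  for (const key of Object.keys(schema)) {
    const entry = schema[key];
    if (entry) entry.presence /= rows.length; // convert count to a rate
  }
  return schema;
}

// Compare a baseline schema with the current one and list every drift found.
function diffSchemas(baseline: Schema, current: Schema, maxDrop = 0.2): Drift[] {
  const drifts: Drift[] = [];
  for (const field of Object.keys(baseline)) {
    const before = baseline[field];
    const after = current[field];
    if (!before) continue;
    if (!after) {
      drifts.push({ kind: "field_removed", field, detail: `was ${before.type}` });
      continue;
    }
    if (before.type !== after.type) {
      drifts.push({ kind: "type_changed", field, detail: `${before.type} -> ${after.type}` });
    }
    if (before.presence - after.presence > maxDrop) {
      drifts.push({ kind: "presence_dropped", field, detail: `${before.presence} -> ${after.presence}` });
    }
  }
  for (const field of Object.keys(current)) {
    if (!(field in baseline)) drifts.push({ kind: "field_added", field, detail: "new field" });
  }
  return drifts;
}

// Usage with made-up rows: id changes type, title goes missing in half the rows,
// tags turns from an array into an object, url disappears, score is new.
const baseline = extractSchema([
  { id: 1, title: "a", tags: ["x"], url: "u" },
  { id: 2, title: "b", tags: [], url: "v" },
]);
const current = extractSchema([
  { id: "1", title: "a", tags: { x: true }, score: 10 },
  { id: "2", tags: { y: true } },
]);
const drifts = diffSchemas(baseline, current);
for (const d of drifts) console.log(`${d.kind} ${d.field}: ${d.detail}`);
```

Counting a field as present whenever its key exists (even with a null value) is one possible reading of "presence rate"; counting only non-null values would be an equally plausible choice.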

Key design decisions

  • Snapshots as CI artifacts (not committed to repo) — zero repo pollution, no push permissions needed
  • Drift preserves baseline — when drift is detected, the old snapshot is NOT updated, so CI keeps failing until the adapter is fixed
  • Command failures don't block — network issues are recorded as failed (with consecutive failure tracking) but don't trigger CI failure; only actual schema drift does. Exception: if ALL commands fail (total outage), CI fails.
  • Atomic file writes — prevents CI cancel-in-progress from corrupting snapshot JSON
  • Third-party action: uses dawidd6/action-download-artifact (pinned to a commit SHA) because GitHub's official actions/download-artifact downloads from the current run by default, while the baseline lives in a previous run. This action is a widely used way to fetch artifacts across workflow runs.
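The atomic-write decision above usually comes down to writing a temporary file and renaming it over the target. The sketch below shows that pattern with Node's built-in `fs` module; `writeJsonAtomic`, the temp-file naming, and the snapshot file name are hypothetical and not taken from the PR.

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical helper; the PR's actual implementation may differ.
function writeJsonAtomic(targetPath: string, data: unknown): void {
  // Keep the temp file in the target's directory so the rename stays on one filesystem.
  const tmpPath = join(dirname(targetPath), `.${process.pid}.${Date.now()}.tmp`);
  writeFileSync(tmpPath, JSON.stringify(data, null, 2));
  // rename replaces the target in one step on POSIX filesystems, so a reader
  // sees either the old snapshot or the new one, never a half-written file.
  renameSync(tmpPath, targetPath);
}

// Usage: write a snapshot twice into a scratch directory.
const dir = mkdtempSync(join(tmpdir(), "snapshot-"));
const snapshotPath = join(dir, "hackernews-top.json");
writeJsonAtomic(snapshotPath, { fields: { title: { type: "string", presence: 1 } } });
writeJsonAtomic(snapshotPath, { fields: {} });
console.log(readFileSync(snapshotPath, "utf8"));
```

If the job is cancelled during `writeFileSync`, only the temp file is left incomplete and the previous snapshot stays intact.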

Covered commands (25 across 15 sites)

hackernews (top/best/new/show/ask/jobs), v2ex, bloomberg, apple-podcasts, arxiv, bbc, devto, lobsters, stackoverflow, steam, wikipedia, sinafinance, weread, xiaoyuzhou, yollomi

Type of Change

  • ✨ New feature
  • 🔧 CI / build / tooling

Checklist

  • I ran the checks relevant to this PR
  • I updated tests or docs if needed
  • I included output or screenshots when useful

Follow-up suggestion

tests/contract/schema.test.ts is currently only executed in the daily contract-check workflow. Consider adding tests/contract/ to the unit-test job path in ci.yml so these tests also run on PR submissions:

- run: npx vitest run src/ --reporter=verbose --shard=${{ matrix.shard }}/2
+ run: npx vitest run src/ tests/contract/ --reporter=verbose --shard=${{ matrix.shard }}/2

Screenshots / Output

Local first run (17/25 passed, 8 failed due to network)

Schema Contract Check -- 2026-03-24

  ✓ hackernews/top           -- no drift
  ✓ hackernews/best          -- no drift
  ✓ hackernews/new           -- no drift
  ✓ hackernews/show          -- no drift
  ✓ hackernews/ask           -- no drift
  ✓ hackernews/jobs          -- no drift
  ⚠ v2ex/hot                 -- command failed
  ⚠ v2ex/latest              -- command failed
  ✓ apple-podcasts/top       -- no drift
  ✓ apple-podcasts/search    -- no drift
  ✓ arxiv/search             -- no drift
  ✓ devto/top                -- no drift
  ✓ lobsters/hot             -- no drift
  ✓ stackoverflow/hot        -- no drift
  ✓ steam/top-sellers        -- no drift
  ✓ sinafinance/news         -- no drift
  ✓ weread/ranking           -- no drift
  ✓ xiaoyuzhou/podcast       -- no drift
  ✓ yollomi/models           -- no drift

Summary: 17 passed, 0 drifted, 8 failed

Second run — 0 false positives

Summary: 16 passed, 0 drifted, 9 failed
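Under the failure policy in the design decisions, both summaries above leave CI green: some commands failed, but nothing drifted and at least one command passed. A minimal sketch of that decision, assuming a `Summary` shape that mirrors the log line (the interface and `exitCode` function are illustrative names, not the PR's code):

```typescript
// Hypothetical shape mirroring the "Summary:" line in the CI log.
interface Summary { passed: number; drifted: number; failed: number }

function exitCode(s: Summary): 0 | 1 {
  if (s.drifted > 0) return 1;  // schema drift always fails CI
  if (s.passed === 0) return 1; // total outage: no command passed
  return 0;                     // partial command failures are tolerated
}

console.log(exitCode({ passed: 17, drifted: 0, failed: 8 }));  // first run: 0
console.log(exitCode({ passed: 16, drifted: 0, failed: 9 }));  // second run: 0
console.log(exitCode({ passed: 0, drifted: 0, failed: 25 }));  // total outage: 1
console.log(exitCode({ passed: 24, drifted: 1, failed: 0 }));  // drift: 1
```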

Unit tests — 22 passed

$ npx vitest run tests/contract/schema.test.ts
 ✓ tests/contract/schema.test.ts (22 tests) 4ms

Test Files  1 passed (1)
     Tests  22 passed (22)

…ner#50)

Add a CI workflow that runs 25 public API commands daily, extracts
schema snapshots (field names, types, presence rates), and diffs
against the previous baseline to detect structural changes.

- Four drift types: field_added, field_removed, type_changed, presence_dropped
- Snapshots stored as CI artifacts (90-day retention), not committed to repo
- Drift preserves old baseline until adapter is fixed
- Atomic file writes to prevent CI cancel corruption
- Total outage (0 passed) triggers CI failure
- 22 unit tests covering extraction, diff, and reporting
Astro-Han force-pushed the worktree-contract-check branch from f5068bf to c2c0668 on March 24, 2026 08:04