Add E2E tests to CI with PR comment reporting #49

Merged: janisz merged 9 commits into main from e2e_in_ci on Mar 3, 2026

janisz (Contributor) commented Mar 2, 2026

No description provided.

Adds a GitHub Actions workflow that runs E2E tests on every PR, with:
- Automatic cancellation of older runs when new commits are pushed
- A single PR comment that is updated with new results instead of creating duplicates
- Test results that show the commit SHA and a link to the workflow run
- An OpenAI API key, read from GitHub secrets, for the LLM judge

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
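
The "single PR comment that updates" behaviour described above is commonly implemented by tagging the bot's comment with a hidden marker and looking for that marker on later runs. The following is a minimal Python sketch of that decision logic only, not the workflow from this PR: the `MARKER` string, the `Comment` type, and the `build_body` and `plan_upsert` helpers are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical hidden marker used to recognise the bot's own comment.
# The PR does not say how the real workflow identifies its comment.
MARKER = "<!-- e2e-test-results -->"


@dataclass
class Comment:
    """Minimal stand-in for a PR comment (assumed shape: id + body)."""
    id: int
    body: str


def build_body(commit_sha: str, run_url: str, summary: str) -> str:
    """Render the comment: marker, short commit SHA, run link, summary text."""
    return "\n".join(
        [
            MARKER,
            "E2E Test Results",
            "",
            f"Commit: {commit_sha[:7]}",
            f"Workflow Run: {run_url}",
            "",
            summary.strip(),
        ]
    )


def plan_upsert(existing: List[Comment]) -> Tuple[str, Optional[int]]:
    """Return ("update", id) when a marked comment exists, else ("create", None)."""
    for comment in existing:
        if MARKER in comment.body:
            return ("update", comment.id)
    return ("create", None)
```

On GitHub, conversation comments on a pull request are issue comments, so the "update" and "create" outcomes would map to the REST API's update-issue-comment and create-issue-comment endpoints. Keying on a marker rather than on the comment author keeps the lookup stable even if other automation posts under the same bot account.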
github-actions bot commented Mar 2, 2026

E2E Test Results

Commit: 1a524dc
Workflow Run: View Details

=== Evaluation Summary ===

  ✓ list-clusters (assertions: 3/3)
  ✓ cve-detected-workloads (assertions: 3/3)
  ✓ cve-detected-clusters (assertions: 3/3)
  ~ cve-nonexistent (assertions: 2/3)
      - MaxToolCalls: Too many tool calls: expected <= 3, got 5
  ✓ cve-cluster-does-exist (assertions: 3/3)
  ~ cve-cluster-does-not-exist (assertions: 2/3)
      - ToolsUsed: Required tool not called: server=stackrox-mcp, tool=, pattern=list_clusters
  ✓ cve-clusters-general (assertions: 3/3)
  ✓ cve-cluster-list (assertions: 3/3)
  ✓ cve-log4shell (assertions: 3/3)
  ✓ cve-multiple (assertions: 3/3)
  ✓ rhsa-not-supported (assertions: 2/2)

Tasks:      11/11 passed (100.00%)
Assertions: 30/32 passed (93.75%)
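
The totals above can be cross-checked against the per-task lines. The sketch below is an illustrative Python check, not part of mcpchecker: the regular expression and the `tally` helper are assumptions based only on the line format visible in this comment. It also suggests that a `~` task counts as passed at the task level while still reporting a failed assertion, which would explain 11/11 tasks alongside 30/32 assertions.

```python
import re

# Per-task lines copied from the summary above (detail lines omitted).
SUMMARY = """\
  ✓ list-clusters (assertions: 3/3)
  ✓ cve-detected-workloads (assertions: 3/3)
  ✓ cve-detected-clusters (assertions: 3/3)
  ~ cve-nonexistent (assertions: 2/3)
  ✓ cve-cluster-does-exist (assertions: 3/3)
  ~ cve-cluster-does-not-exist (assertions: 2/3)
  ✓ cve-clusters-general (assertions: 3/3)
  ✓ cve-cluster-list (assertions: 3/3)
  ✓ cve-log4shell (assertions: 3/3)
  ✓ cve-multiple (assertions: 3/3)
  ✓ rhsa-not-supported (assertions: 2/2)
"""

# Assumed line shape: "<mark> <task-name> (assertions: <passed>/<total>)".
LINE = re.compile(r"^\s*(\S)\s+(\S+)\s+\(assertions:\s*(\d+)/(\d+)\)")


def tally(text):
    """Sum passed/total assertions and list tasks with any failed assertion."""
    passed = total = 0
    partial = []
    for line in text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # skip blank lines and "- Detail: ..." lines
        name = match.group(2)
        p, t = int(match.group(3)), int(match.group(4))
        passed += p
        total += t
        if p < t:
            partial.append(name)
    return passed, total, partial


passed, total, partial = tally(SUMMARY)
print(f"{passed}/{total} ({passed / total:.2%})")  # prints "30/32 (93.75%)"
print(partial)  # the two tasks marked "~"
```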

janisz and others added 6 commits March 2, 2026 17:59
- Remove redundant dependency download (Go handles automatically)
- Remove redundant mcpchecker build (script handles it)
- Remove unnecessary EXIT_CODE capture
- Add failure comment when job fails

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Use built-in mcpchecker summary instead of manual jq parsing
- Add results to GitHub Actions step summary
- Cleaner and more maintainable output generation

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Use the built-in text output from mcpchecker summary directly
instead of creating custom markdown tables. Simpler and cleaner.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Switched the agent from claude-code to the OpenAI agent to support CI environments
where the Claude CLI is not available. Uses gpt-5-nano for cost efficiency.

- Update eval.yaml to use builtin.openai-agent with gpt-5-nano
- Set MODEL_BASE_URL and MODEL_KEY env vars in run-tests.sh
- Pass MODEL_KEY in GitHub Actions workflow

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Tomasz Janiszewski <tomek@redhat.com>
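
The commit above passes the model credentials through the `MODEL_BASE_URL` and `MODEL_KEY` environment variables. As a rough illustration of that hand-off, here is a Python sketch that fails fast when the key is missing and fills in a base URL. It is not the contents of run-tests.sh: the `model_env` helper is hypothetical, and the default URL is an assumption (the public OpenAI API base), since the PR does not state the value the script sets.

```python
# Assumption: the public OpenAI API base URL. The value actually set in
# run-tests.sh is not shown in this PR.
DEFAULT_BASE_URL = "https://api.openai.com/v1"


def model_env(environ):
    """Return a copy of `environ` suitable for launching the E2E tests.

    Raises RuntimeError when MODEL_KEY is missing or empty, so that a
    misconfigured secret fails early rather than midway through a run.
    """
    if not environ.get("MODEL_KEY"):
        raise RuntimeError("MODEL_KEY is not set; cannot call the model API")
    env = dict(environ)
    # Keep an explicitly provided base URL; otherwise use the assumed default.
    env.setdefault("MODEL_BASE_URL", DEFAULT_BASE_URL)
    return env


# Example use (script path is hypothetical):
#   import os, subprocess
#   subprocess.run(["./run-tests.sh"], env=model_env(os.environ), check=True)
```

Validating the key before starting the tests gives a clearer failure in CI when the repository secret is absent, for example on pull requests from forks, where GitHub does not expose secrets to workflows by default.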
codecov-commenter commented Mar 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 78.37%. Comparing base (ce91d76) to head (1a524dc).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             main      #49   +/-   ##
=======================================
  Coverage   78.37%   78.37%           
=======================================
  Files          27       27           
  Lines        1216     1216           
=======================================
  Hits          953      953           
  Misses        223      223           
  Partials       40       40           

janisz requested a review from mtodor on March 3, 2026 10:49
janisz and others added 2 commits March 3, 2026 17:52
Co-authored-by: Mladen Todorovic <mtodor@gmail.com>
Signed-off-by: Tomasz Janiszewski <tomek@redhat.com>
janisz merged commit 784bec2 into main on Mar 3, 2026
6 checks passed
janisz deleted the e2e_in_ci branch on March 3, 2026 18:06