fix(ci): benchmarks and windows build #148
Conversation
…alysis Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
…e detection Temporary implementation using `whoami /priv` and `whoami /groups` to avoid unsafe Win32 FFI calls, preserving the workspace-level `unsafe_code = "forbid"` policy. This will be replaced by the `token-privilege` crate once it is built. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
…ivilege crate Remove the whoami shell-out approach. Windows now runs in degraded mode until the token-privilege crate provides safe FFI wrappers for SeDebugPrivilege and elevation detection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Caution: review failed (failed to post review comments).

Summary by CodeRabbit
Walkthrough: Extracts benchmarking into a dedicated GitHub Actions workflow, adds focused Criterion bench modules and helpers, splits the large performance bench into multiple suites, shortens and simplifies some benches, updates tooling versions, adds a just target and docs, and replaces Windows privilege detection with a degraded-mode stub.
Sequence Diagram(s):

```mermaid
sequenceDiagram
    participant Dev as Developer
    participant GH as GitHub Actions
    participant Runner as Runner (ubuntu-latest)
    participant Cargo as Cargo/Rust toolchain
    participant Cache as Cache/Artifacts
    rect rgba(99,102,241,0.5)
    Dev->>GH: Manually trigger "Benchmarks" with suite
    GH->>Runner: Start job (benchmarks / load-tests)
    Runner->>Runner: checkout code
    Runner->>Runner: install deps (mise-action) & restore cache
    Runner->>Cargo: run `cargo bench` or `cargo test` (suite-specific)
    Cargo-->>Runner: benchmark/test output
    Runner->>Runner: scan output for regressions
    Runner->>Cache: upload artifacts (benchmark-results / load-test-results)
    Runner->>Cache: save/update Criterion baselines (on main)
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks: ✅ all 3 checks passed.
Merge Protections: Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 CI must pass: this rule succeeded. All CI checks must pass. Release-plz PRs are exempt because they only bump versions and changelogs (code was already tested on main), and GITHUB_TOKEN-triggered force-pushes suppress CI.
🟢 Do not merge outdated PRs: this rule succeeded. PRs must be within 3 commits of the base branch before merging.
Related Documentation: 5 document(s) may need updating based on files changed in this PR: DaemonEye CI and Automation Hooks. Suggested changes:

@@ -4,7 +4,8 @@
All CI/CD workflows are defined in the `.github/workflows/` directory. Key workflows include:
-- `ci.yml`: Main CI pipeline for code quality, testing, cross-platform builds, coverage reporting, and performance benchmarks ([ci.yml](https://github.com/EvilBit-Labs/DaemonEye/blob/a169b6886897bf281c27ce9d86d4738233658cd4/.github/workflows/ci.yml)).
+- `ci.yml`: Main CI pipeline for code quality, testing, cross-platform builds, and coverage reporting ([ci.yml](https://github.com/EvilBit-Labs/DaemonEye/blob/a169b6886897bf281c27ce9d86d4738233658cd4/.github/workflows/ci.yml)).
+- `benchmarks.yml`: Performance benchmarks and load tests, run on-demand ([benchmarks.yml](https://github.com/EvilBit-Labs/DaemonEye/blob/main/.github/workflows/benchmarks.yml)).
- `docs.yml`: Documentation build and deployment to GitHub Pages ([docs.yml](https://github.com/EvilBit-Labs/DaemonEye/blob/a169b6886897bf281c27ce9d86d4738233658cd4/.github/workflows/docs.yml)).
- `codeql.yml`: Code analysis and security scanning ([codeql.yml](https://github.com/EvilBit-Labs/DaemonEye/blob/a169b6886897bf281c27ce9d86d4738233658cd4/.github/workflows/codeql.yml)).
- `scorecard.yml`: Supply-chain security analysis using OpenSSF Scorecard ([scorecard.yml](https://github.com/EvilBit-Labs/DaemonEye/blob/main/.github/workflows/scorecard.yml)).
@@ -36,7 +37,6 @@
- `test` runs independently
- `test-cross-platform` depends on `quality`
- `coverage` depends on `test`, `test-cross-platform`, and `quality`
-- `benchmarks` depends on `test`
The Scorecard workflow runs its analysis job only on the default branch or for pull requests, and uploads results to both the code scanning dashboard and as artifacts.
@@ -50,14 +50,30 @@
Documentation is built using `mdBook` and Rustdoc, with plugins for enhanced features. The workflow builds and deploys documentation to GitHub Pages, triggered by code changes or manual dispatch ([docs.yml](https://github.com/EvilBit-Labs/DaemonEye/blob/a169b6886897bf281c27ce9d86d4738233658cd4/.github/workflows/docs.yml)).
-**Benchmarks Job**
+**Benchmarks Workflow**
-Performance benchmarks are executed using Criterion after tests complete. Key features:
-- Caches baseline benchmarks for regression detection
-- Fails CI if performance regressions exceed 10% threshold
-- Runs load tests to validate system behavior under stress
+Performance benchmarks and load tests are executed in a separate `.github/workflows/benchmarks.yml` workflow that runs on-demand via manual trigger (`workflow_dispatch`). This workflow is independent of the main CI pipeline, allowing developers to run performance tests when needed without impacting regular CI execution times.
+
+The workflow provides a configurable `suite` input parameter with the following options:
+- `all` (default): Runs all benchmark suites
+- `performance_benchmarks`: Runs only the performance benchmark suite
+- `process_collector_benchmarks`: Runs only the process collector benchmark suite
+
+The workflow contains two independent jobs that run in parallel:
+
+**Benchmarks Job** (15-minute timeout):
+- Runs `cargo bench --package procmond` (or specific suites based on input)
+- Restores baseline benchmarks from cache for regression detection
+- Checks for performance regressions and logs warnings if detected (does not fail the build)
+- Saves new baseline benchmarks to cache only when running on the main branch
- Uploads benchmark results as artifacts with 30-day retention
-- Only saves new baselines when running on the main branch
+
+**Load Tests Job** (10-minute timeout):
+- Runs `cargo test --package procmond --test load_tests -- --ignored --nocapture`
+- Validates system behavior under stress
+- Uploads load test results as artifacts with 30-day retention
+
+Unlike the previous CI integration, performance regressions now log warnings but do not fail the build, allowing for more flexible performance monitoring while still alerting maintainers to potential issues.
**Mergify Integration**
@@ -98,6 +114,7 @@
**References**
- [ci.yml](https://github.com/EvilBit-Labs/DaemonEye/blob/a169b6886897bf281c27ce9d86d4738233658cd4/.github/workflows/ci.yml)
+- [benchmarks.yml](https://github.com/EvilBit-Labs/DaemonEye/blob/main/.github/workflows/benchmarks.yml)
- [docs.yml](https://github.com/EvilBit-Labs/DaemonEye/blob/a169b6886897bf281c27ce9d86d4738233658cd4/.github/workflows/docs.yml)
- [codeql.yml](https://github.com/EvilBit-Labs/DaemonEye/blob/a169b6886897bf281c27ce9d86d4738233658cd4/.github/workflows/codeql.yml)
- [scorecard.yml](https://github.com/EvilBit-Labs/DaemonEye/blob/main/.github/workflows/scorecard.yml)
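The regression-scan step the workflow documentation describes can be sketched as a small shell helper. This is a hypothetical sketch, not the actual `benchmarks.yml` step: the marker string is the text Criterion prints when a baseline comparison detects a slowdown, and the warning-only handling reflects the documented (not necessarily actual) policy.

```shell
#!/bin/sh
# Hypothetical sketch of the regression-scan step from benchmarks.yml.
# Criterion prints "Performance has regressed." per benchmark when comparing
# against a saved baseline; the warning-only behavior here is an assumption.
scan_for_regressions() {
  log_file="$1"
  if grep -q "Performance has regressed" "$log_file"; then
    echo "warning: possible benchmark regression detected in $log_file"
  else
    echo "no regressions detected in $log_file"
  fi
}

# Simulate a Criterion log containing one regressed benchmark.
printf 'wal_append  time: [1.2 ms 1.3 ms 1.4 ms]\nPerformance has regressed.\n' > bench.log
scan_for_regressions bench.log
```

In the real workflow this would run after `cargo bench`, with the log captured via `tee`; whether a detected regression merely warns or fails the job is exactly the discrepancy the review comment below flags.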
Pull request overview
Refactors CI/benchmarking and Windows privilege handling in procmond, primarily by moving benchmark/load-test execution into a dedicated GitHub Actions workflow and simplifying benchmark suites while keeping the workspace unsafe-free on Windows builds.
Changes:
- Moved benchmarks and load tests out of the main CI workflow into a new `.github/workflows/benchmarks.yml`.
- Simplified and sped up `procmond` Criterion benchmark suites by reducing measurement times and removing load-simulation scenarios.
- Stubbed Windows privilege detection to avoid unsafe Win32 token inspection until an external safe wrapper crate is available.
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 5 comments.
Summary per file:
| File | Description |
|---|---|
| `procmond/src/security.rs` | Removes Win32 token privilege checks and stubs Windows security context detection. |
| `procmond/benches/process_collector_benchmarks.rs` | Removes artificial delay/load benchmark and trims benchmark scale parameters. |
| `procmond/benches/performance_benchmarks.rs` | Reduces measurement times and trims batch/event count ranges for faster runs. |
| `mise.toml` | Updates pinned tool versions used by mise. |
| `mise.lock` | Regenerates the mise lockfile to match updated tool versions. |
| `justfile` | Adds a `bench-procmond` convenience recipe. |
| `.github/workflows/ci.yml` | Removes the benchmarks/load-tests job from main CI. |
| `.github/workflows/benchmarks.yml` | Adds a dedicated workflow to run benchmarks and load tests. |
CI Insights: here's what we observed from your CI run for facb138. 🟢 All jobs passed!
Codecov Report: ✅ All modified and coverable lines are covered by tests.
…e checks Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
feat(benches): implement system-level performance benchmarks for process collection and RPC
fix(security): update Windows privilege detection documentation and tracking issue
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- **Suite Selection**: Choose which benchmark suite to run: "all" (default), "performance_benchmarks", or "process_collector_benchmarks"
- **Workflow Artifacts**: Download `benchmark-results` and `load-test-results` artifacts from the workflow summary page
- **CI Logs**: View benchmark output directly in the workflow logs
- **Performance Alerts**: Regressions are logged as warnings for manual review without failing the workflow
The docs say benchmark regressions are "logged as warnings but do not fail the build/workflow", but the actual .github/workflows/benchmarks.yml step exits non-zero when regressions exceed the threshold (currently 20%). Update the documentation to reflect the current behavior (or adjust the workflow to match the documented non-failing behavior), otherwise this will surprise people relying on the docs.
Suggested change: **Performance Alerts**: Significant regressions (currently >20% slowdown versus the `main` baseline) cause the benchmarks job to fail, while smaller regressions are logged as warnings for manual review.
```rust
// Removed (old unsafe Win32 token inspection):
let _ = unsafe { windows::Win32::Foundation::CloseHandle(token_handle) };
info.is_ok() && elevation.TokenIsElevated != 0

// Added (degraded-mode stub):
warn!("Windows privilege detection not yet implemented; running in degraded mode");
```
detect_privileges() already logs a warning when ctx.degraded_mode is true. With the new Windows stub, detect_windows_privileges() also emits a warning, so Windows will log two warnings every time privilege detection runs. Consider downgrading the Windows-specific message to debug!/info!, or removing it and relying on the outer degraded-mode warning (optionally adding a capability note like "windows_privileges_unimplemented" for visibility).
Suggested change: downgrade `warn!` to `info!`: `info!("Windows privilege detection not yet implemented; running in degraded mode");`
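As a rough sketch of the degraded-mode approach discussed here: the type and function names below are illustrative, not the actual procmond API, and `println!` stands in for the real tracing macros.

```rust
// Illustrative sketch only: real procmond types and logging macros differ.
#[derive(Debug, Default)]
struct SecurityContext {
    elevated: bool,
    degraded_mode: bool,
}

#[cfg(target_os = "windows")]
fn detect_windows_privileges(ctx: &mut SecurityContext) {
    // No safe FFI wrapper is available yet, so report degraded mode
    // instead of performing unsafe Win32 token inspection. This keeps
    // the workspace-level `unsafe_code = "forbid"` policy intact.
    ctx.elevated = false;
    ctx.degraded_mode = true;
}

#[cfg(not(target_os = "windows"))]
fn detect_windows_privileges(ctx: &mut SecurityContext) {
    // Non-Windows platforms are unaffected by the stub.
    ctx.degraded_mode = false;
}

fn main() {
    let mut ctx = SecurityContext::default();
    detect_windows_privileges(&mut ctx);
    println!("degraded_mode = {}", ctx.degraded_mode);
}
```

Under this shape, the single degraded-mode warning naturally belongs in the caller that inspects `ctx.degraded_mode`, which is the deduplication the review suggests.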
```rust
let initial_file_count = std::fs::read_dir(temp_dir.path())
    .map(|entries| {
        entries
            .filter_map(|e| e.ok())
            .filter(|e| e.path().extension().is_some_and(|ext| ext == "wal"))
            .count()
    })
    .unwrap_or_else(|_err| {
        eprintln!("WARNING: Could not measure memory for current process");
        0
    });
```
The fallback warning message here says "Could not measure memory for current process", but this code path is handling a failure to read the temp directory to count .wal files. This looks like a copy/paste and makes benchmark output misleading; update the message to reflect the actual operation (e.g., failed to read WAL directory / count WAL files).
```yaml
name: Benchmarks

on:
  workflow_dispatch:
    inputs:
      suite:
        description: "Benchmark suite to run"
        required: false
        default: "all"
        type: choice
        options:
          - all
          - performance_benchmarks
          - process_collector_benchmarks
```
The benchmarking workflow example in the docs still lists performance_benchmarks as a selectable suite, but that bench was removed and the actual workflow input options are now wal_benchmarks, eventbus_benchmarks, serialization_benchmarks, system_benchmarks, and process_collector_benchmarks. Please update the options list (and any surrounding text) to match the real .github/workflows/benchmarks.yml input choices so readers can run the documented suites successfully.
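Per that comment, a corrected input block might look like the following; the suite names are taken from the review comment itself, not verified against the current `.github/workflows/benchmarks.yml`.

```yaml
on:
  workflow_dispatch:
    inputs:
      suite:
        description: "Benchmark suite to run"
        required: false
        default: "all"
        type: choice
        options:
          - all
          - wal_benchmarks
          - eventbus_benchmarks
          - serialization_benchmarks
          - system_benchmarks
          - process_collector_benchmarks
```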
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
This pull request refactors and improves the benchmarking infrastructure for the project. The main changes include moving benchmark and load test jobs out of the main CI workflow into a dedicated GitHub Actions workflow, updating tool versions for better compatibility and performance, and streamlining benchmark suites for faster execution and easier maintenance. Additionally, several benchmark parameters and test cases have been adjusted to focus on more representative workloads and reduce unnecessary complexity. This also intentionally stubs out the Windows token checks pending the creation of the EvilBit-Labs/token-privilege crate.
Workflow and Infrastructure Changes:
- Added a dedicated `.github/workflows/benchmarks.yml` workflow for running benchmarks and load tests, separating them from the main CI workflow and improving maintainability.
- Removed the benchmarks job from `.github/workflows/ci.yml` to avoid duplication and streamline CI.

Tooling Updates:
- Updated pinned tool versions in `mise.toml` (e.g., `cargo-binstall`, `cargo-insta`, `cargo-audit`, `cargo-llvm-cov`, `cargo-nextest`, `cargo-release`, `cargo-auditable`, `actionlint`, `lychee`, `markdownlint-cli2`, `protobuf`, `protoc`) for improved compatibility and reliability. [1] [2]

Benchmark Suite and Parameter Adjustments:
- Adjusted `procmond/benches/performance_benchmarks.rs` to speed up test execution and focus on relevant scenarios. [1] [2] [3] [4] [5] [6] [7] [8] [9]

Justfile Improvements:
- Added a `bench-procmond` target to the `justfile` for running procmond-specific benchmarks, making it easier to invoke relevant tests.
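The new recipe presumably has roughly the following shape; the exact command and any extra flags are assumptions based on the `cargo bench --package procmond` invocation mentioned earlier in this PR.

```just
# Hypothetical shape of the bench-procmond recipe; flags are assumptions.
bench-procmond:
    cargo bench --package procmond
```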