Draft
Pull request overview
This PR continues the migration to a Nix-based build and CI workflow, replacing the prior compile-env/docker-based approach and wiring sysroot/toolchain configuration through Nix shells and Nix builds.
Changes:
- Replaces the legacy compile-env + fake-nix workflow with `default.nix`/overlays, `nix-shell`, and updated `just` recipes.
- Updates CI (`dev.yml`) to build/test via Nix targets and introduces new Nix packaging pieces (FRR packaging, platform/profile plumbing).
- Refactors sysroot usage in Rust build scripts and updates docs to match the new Nix-first workflow.
Reviewed changes
Copilot reviewed 55 out of 56 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| testing.md | Updates testing instructions to assume nix-shell tooling. |
| sysfs/build.rs | Removes sysroot build script logic. |
| sysfs/Cargo.toml | Drops dpdk-sysroot-helper build-dependency. |
| shell.nix | Switches shell entrypoint to default.nix devenv. |
| scripts/update-doc-headers.sh | Bumps KaTeX version used in docs. |
| scripts/todo.sh | Adds a Nix-based build/test “checklist” script. |
| scripts/test-runner.sh | Removes legacy docker-based test runner wrapper. |
| scripts/rust.env | Removes legacy RUSTFLAGS/profile env file. |
| scripts/k8s-crd.env | Updates gateway CRD ref env file (now likely legacy). |
| scripts/install-real-nix.sh | Adds helper to replace “fake nix” with real Nix install. |
| scripts/dpdk-sys.env | Updates pinned dpdk-sys commit. |
| scripts/doc/custom-header.html | Updates KaTeX CDN links and integrity hashes. |
| rust-toolchain.toml | Removes rustup toolchain file in favor of Nix toolchain sourcing. |
| routing/Cargo.toml | Cleans tokio features and adds dev tokio “full”. |
| npins/sources.json | Updates Nix pins (crane, frr, gateway, nixpkgs, rust, rust-overlay). |
| nix/profiles.nix | Adjusts compile/link/security profile flags and profile mapping. |
| nix/platforms.nix | Adds platform name mapping for bluefield2 → bluefield. |
| nix/pkgs/frr/patches/yang-hack.patch | Adds FRR/libyang-related patch. |
| nix/pkgs/frr/patches/xrelifo.py.fix.patch | Adds FRR python/xrelfo patch. |
| nix/pkgs/frr/default.nix | Introduces FRR derivation with configurable protocol support. |
| nix/pkgs/frr/clippy-helper.nix | Adds split derivation for FRR “clippy” tool for cross builds. |
| nix/pkgs/dpdk/default.nix | Simplifies DPDK build params and uses platform-provided properties. |
| nix/overlays/llvm.nix | Reworks LLVM+Rust toolchain overlay to source versions from pins. |
| nix/overlays/frr.nix | Adds overlay customizing dependencies for FRR static/cross builds. |
| nix/overlays/default.nix | Registers new overlays (rust/llvm/dataplane/frr). |
| nix/overlays/dataplane.nix | Wires platform/profile into DPDK build and tweaks deps. |
| nix/overlays/dataplane-dev.nix | Uses llvmPackages’ stdenv and adds a static-leaning gdb override. |
| net/src/buffer/test_buffer.rs | Cleans doc-only import; adds explicit PacketBuffer doc link. |
| mgmt/tests/reconcile.rs | Adds VM-runner attribute to a test. |
| mgmt/src/tests/mgmt.rs | Removes unused imports and disables a VM test during refactor. |
| mgmt/Cargo.toml | Adds n-vm + tracing-subscriber for tests. |
| k8s-intf/build.rs | Refactors CRD generation to OUT_DIR and env-driven inputs. |
| k8s-intf/Cargo.toml | Swaps build deps to dpdk-sysroot-helper. |
| justfile | Replaces compile-env/sterile/docker flows with Nix build/test/container commands. |
| init/build.rs | Switches to dpdk_sysroot_helper::use_sysroot() behind feature gate. |
| init/Cargo.toml | Introduces sysroot feature and makes sysroot helper optional. |
| hardware/src/os/mod.rs | Fixes a typo in a clippy lint comment. |
| hardware/build.rs | Switches to centralized use_sysroot(). |
| dpdk/src/lcore.rs | Updates lcore ID call to rte_lcore_id(). |
| dpdk/build.rs | Switches to centralized use_sysroot(). |
| dpdk-sysroot-helper/src/lib.rs | Changes sysroot discovery to DATAPLANE_SYSROOT and adds use_sysroot(). |
| dpdk-sys/build.rs | Updates bindgen/sysroot handling and link libs list. |
| development/code/running-tests.md | Updates test-running docs to Nix-first commands. |
| default.nix | Major Nix build definition: dev shell env, profiles, test archives, container tars. |
| dataplane/src/drivers/dpdk.rs | Gates DPDK driver file behind dpdk feature. |
| dataplane/build.rs | Switches to centralized use_sysroot() behind dpdk feature. |
| dataplane/Cargo.toml | Makes dpdk deps optional behind a dpdk feature (default on). |
| cli/build.rs | Removes sysroot build script logic. |
| cli/Cargo.toml | Drops dpdk-sysroot-helper build-dependency. |
| README.md | Updates developer setup/docs to nix-shell workflow. |
| Cargo.toml | Updates workspace version and dependency versions. |
| Cargo.lock | Updates lockfile to match dependency/version changes. |
| .github/workflows/dev.yml.old | Keeps old workflow as .old (new file added). |
| .github/workflows/dev.yml | Reworks CI to use Nix builds and archives. |
| .envrc | Simplifies direnv env vars for the new devroot/sysroot layout. |
| .cargo/config.toml | Updates env vars and rustflags for sysroot/devroot-based builds. |
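
The table above says `dpdk-sysroot-helper` now discovers the sysroot through `DATAPLANE_SYSROOT` and exposes `use_sysroot()` to the build scripts. As a rough illustration only, a helper of that kind might look like the sketch below. The function names `sysroot_from` and `sysroot_directives`, the error text, and the `lib` subdirectory are assumptions made for this example and are not taken from the PR.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: resolve the sysroot from an already looked-up
/// `DATAPLANE_SYSROOT` value. A build script would pass
/// `std::env::var("DATAPLANE_SYSROOT").ok()` here.
pub fn sysroot_from(value: Option<String>) -> Result<PathBuf, String> {
    match value {
        Some(v) if !v.trim().is_empty() => Ok(PathBuf::from(v)),
        _ => Err("DATAPLANE_SYSROOT is not set".to_string()),
    }
}

/// Build the cargo directives a build script could print for that sysroot.
/// The `lib` subdirectory is an assumed layout, not confirmed by the PR.
pub fn sysroot_directives(sysroot: &Path) -> Vec<String> {
    vec![
        // Re-run the build script whenever the variable changes.
        "cargo:rerun-if-env-changed=DATAPLANE_SYSROOT".to_string(),
        // Tell rustc where to search for native libraries.
        format!(
            "cargo:rustc-link-search=native={}",
            sysroot.join("lib").display()
        ),
    ]
}
```

A `build.rs` using this sketch would print each returned directive with `println!`. Keeping the lookup separate from the directive construction makes both halves easy to test without touching the process environment.
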
- d2a1beb to cddb251
- cddb251 to 3591e49
- 3591e49 to 921adf0
- e3be498 to eb71953
- bae29e6 to 6a688dd
- 81e9456 to 0059740

De-duplicate tokio feature flags in routing/Cargo.toml and add tokio with full features to dev-dependencies for test support. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Co-Authored-By: Daniel Noland <daniel@githedgehog.com>
Update mgmt tests for compatibility with the nix build environment: add n_vm test dependencies, simplify test_sample_config, and add Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Co-Authored-By: Daniel Noland <daniel@githedgehog.com>
Update npins sources (crane, frr, gateway, nixpkgs, rust, rust-overlay) and refresh Cargo.lock. Bump workspace version and update dependency versions in Cargo.toml. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Co-Authored-By: Daniel Noland <daniel@githedgehog.com> Signed-off-by: Manish Vachharajani <manish@githedgehog.com>
Update KaTeX version in custom-header.html and update-doc-headers.sh. Fix a doc typo in hardware/src/os/mod.rs and clean up an unnecessary include in net/src/buffer/test_buffer.rs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Co-Authored-By: Daniel Noland <daniel@githedgehog.com>
Nix now sets up the entire environment, so rust.env is not needed.
Remove all docker/compile-env recipes and variables that are dead code after the migration to nix-based builds. Rewrite build-container and push recipes to use nix build and skopeo directly, and update remaining recipes to call cargo without the old wrapper. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace all references to the old docker/compile-env workflow with the new nix-shell based development environment across README.md, testing.md, and development/code/running-tests.md. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a justfile recipe to create devroot and sysroot symlinks via nix build, making it easy to set up the local development environment. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use continue-on-error with a per-matrix optional flag so that sanitize/address and sanitize/thread failures show as warnings instead of blocking the workflow. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Manish Vachharajani <manish@githedgehog.com>
Previously, we were using the committed generated file without updating it. This fixes that so that we now generate the kopium gateway_agent_crd.rs file in the target directory and properly use it. A big change here is that the gateway agent version now comes from npins/sources.json and not scripts/k8s-crd.env. The procedure to update the CRD is now in the README.md. We also must not exclude json files from the nix sources, or the npins files are not available within nix build. Co-authored-by: Daniel Noland <daniel@githedgehog.com> Signed-off-by: Manish Vachharajani <manish@githedgehog.com>
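
To illustrate the generate-into-the-target-directory pattern this commit describes, here is a minimal sketch of the file-writing half of such a build script. `write_generated` is a hypothetical name, and the source string stands in for real kopium output; the actual `k8s-intf/build.rs` may be structured quite differently.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Write generated Rust source into `out_dir` and return the file path.
/// In a real build script `out_dir` would come from the `OUT_DIR`
/// environment variable that cargo sets for build scripts.
pub fn write_generated(out_dir: &Path, file_name: &str, source: &str) -> io::Result<PathBuf> {
    // Make sure the directory exists; cargo normally creates OUT_DIR already.
    fs::create_dir_all(out_dir)?;
    let path = out_dir.join(file_name);
    // Overwrite any previous output so a stale file is never reused.
    fs::write(&path, source)?;
    Ok(path)
}
```

The crate would then typically pull the file in with `include!(concat!(env!("OUT_DIR"), "/gateway_agent_crd.rs"));`, so nothing generated needs to be committed to the repository.
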
The earlier series of commits adds the address and thread sanitizer to the dev workflows. These fail due to real bugs that need to be addressed. However, that is for later commits. While the sanitizer jobs are marked as optional and do not cause build failure, the summary job still sees them as failed and fails. A future commit should make the summary job somehow look at the optional flag and not fail. Signed-off-by: Manish Vachharajani <manish@githedgehog.com>
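
The rule that a future summary job would need (fail only when a required job fails, and report optional failures as warnings) can be modelled in a few lines. The sketch below is not the workflow itself, which lives in `dev.yml`; the types, function names, and the `debug` job name in the usage note are all hypothetical.

```rust
/// Outcome of one CI matrix job (a simplified stand-in for a workflow result).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Success,
    Failure,
}

/// One matrix entry; `optional` mirrors the per-matrix flag described above.
#[derive(Debug, Clone)]
pub struct JobResult {
    pub name: String,
    pub outcome: Outcome,
    pub optional: bool,
}

/// The summary passes when every required job succeeded.
/// Failures of optional jobs do not affect the result.
pub fn summary_passes(jobs: &[JobResult]) -> bool {
    jobs.iter()
        .all(|job| job.optional || job.outcome == Outcome::Success)
}

/// Names of optional jobs that failed, to be surfaced as warnings.
pub fn optional_failures(jobs: &[JobResult]) -> Vec<&str> {
    jobs.iter()
        .filter(|job| job.optional && job.outcome == Outcome::Failure)
        .map(|job| job.name.as_str())
        .collect()
}
```

With a required `debug` job that succeeds and an optional `sanitize/address` job that fails, `summary_passes` returns `true` and `optional_failures` lists only `sanitize/address`.
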
Remove install-real-nix.sh and todo.sh as these are not needed. Signed-off-by: Manish Vachharajani <manish@githedgehog.com>
- c0b7fa2 to 9d20a83
- fb032ab to 106d5e0

Add npins entries for new FRR component dependencies:
- dplane-plugin (githedgehog/dplane-plugin, master)
- dplane-rpc (githedgehog/dplane-rpc, master)
- frr-agent (githedgehog/frr-agent, master)
- frr: updated revision (stable/10.5)

Update scripts/gen-pins.sh with corresponding npins add commands.

Update npins pins:
- gateway: v0.42.0 -> v0.43.5
- nixpkgs: channel update
- perftest: updated revision
- rust: 1.93.1 -> 1.94.0
- rust-overlay: updated revision

- Remove "man" and "dev" outputs from libmd (not needed for this build)
- Remove ethtool and iproute2 from rdma-core override inputs

Add nix package definitions for FRR container components:
- nix/pkgs/dplane-rpc: C library (cmake build) for dplane RPC
- nix/pkgs/dplane-plugin: C library (cmake build) for FRR dplane plugin
- nix/pkgs/frr-agent: Rust package for FRR agent
- nix/pkgs/frr-config: FRR configuration files package including:
  - daemon configuration (etc/frr/daemons, vtysh.conf, zebra.conf)
  - user/group definitions (etc/passwd, etc/group)
  - nsswitch.conf for DNS resolution
  - container entrypoint script (libexec/frr/docker-start)

Rework nix/overlays/frr.nix:
- Add removeReferencesTo/nukeReferences to strip compiler references from all FRR dependencies via reworked dep function
- Rework FRR build LDFLAGS: add readline, json-c, libatomic linking; use --push-state/--pop-state for static linking control
- Switch to --disable-static-bin (dynamically linked FRR binaries)
- Add preFixup reference stripping with nuke-refs for FRR builds
- Add reference stripping to json_c and readline builds
- Add readline static+shared configure flags
- Switch libelf to shared
- Wire in new packages: frr-agent, frr-config, dplane-rpc, dplane-plugin

Update nix/pkgs/frr/default.nix:
- Add nukeReferences and removeReferencesTo to build inputs
- Add commented-out preFixup reference stripping code
- Remove unused inputs (nixosTests, xz)
- Remove nixosTests passthru

Rework dataplane-tar (formerly min-tar):
- Merge the two-stage min-tar + dataplane-tar build into a single dataplane-tar derivation
- Add busybox symlinks and workspace binary symlinks directly
- Add dontPatchShebangs, dontFixup, dontPatchElf
- Use local libc binding, remove bash/ncurses/readline deps
- Add seccomp filter comment cleanup

Add docs-builder:
- New docs-builder function for building rustdoc documentation
- Add docs attribute set with per-package and all-docs targets

Add tag parameter:
- New tag parameter (default "dev") for container image tagging
- Add VERSION=tag to cargo build environment
- Flip reference-stripping flags to removeReferencesTo* (now removing)

Rework container definitions:
- Rename package-builder -> workspace-builder
- Rename containers.libc -> containers.dataplane with proper ghcr.io name and production contents (busybox, fakeNss, workspace binaries)
- Update dataplane-debugger name and tag inheritance
- Add containers.frr.dataplane with full FRR stack
- Add containers.frr.host with FRR host packages
- Add Entrypoint/Cmd config to all containers
- Export docs, frr-pkgs, dataplane-tar (replacing min-tar)

Change k8s-intf/build.rs get_gateway_version() to read VERSION from the environment instead of parsing npins/sources.json (old code commented out for reference).
- Add frr.dataplane and frr.host to nix build matrix
- Comment out cargo deny check (TODO: re-enable before merge)
- Reformat sanitizer comments
- Split build/test/push into separate CI steps
- Add per-target push-container steps using just recipes
- Add push container for vlab step
- Use ${UPSTREAM_REGISTRY} for oci_repo instead of hardcoded ghcr.io
- Add FRR version bumping to vlab prebuild
- Remove refresh-compile-env step
- Add FRR version bumping in fabricator bump job

Add local vlab development environment tooling:
- Dockerfile for vlab container (ubuntu + docker, qemu, etc.)
- run.sh to set up TLS certs, zot OCI registry, and hhfab vlab
- control.sh helper to exec into vlab via SSH
- Zot registry configuration (cert.ini, config.json)
- .gitignore entries for TLS artifacts (*.pem, *.crt, *.key, *.csr, creds.json)

- Use /usr/bin/env bash instead of ${SHELL:-bash} for shell interpreter
- Add docker_sock variable and _setup_docker_env_ helper
- Change default docker socket to /var/run/docker.sock
- Simplify docker env setup: always set DOCKER_HOST and DOCKER_SOCK unconditionally with unix:// prefix
- Add FRR OCI image variables (oci_frr_prefix, oci_image_frr_dataplane, oci_image_frr_host)
- Change default oci_repo to 192.168.19.1:30000 (vlab default)
- Add sanitizer component to version string
- Remove bolero-specific sanitizers variable
- Give build recipe a default target (dataplane-tar), add mkdir -p results, --out-link, --print-build-logs, --show-trace, --argstr tag
- Add tag arg and quoting fixes to setup-roots recipe
- Add test recipe (builds test archive, runs with cargo nextest)
- Add docs recipe
- Rework build-container and push-container into multi-target dispatch supporting dataplane, dataplane-debugger, frr.dataplane, frr.host
- Add --src-daemon-host to skopeo copy calls in push-container
- Remove load-container (merged into build-container)
- Remove bolero fuzz recipes (list-fuzz-tests, fuzz, fuzz-afl)
- c68bc0e to cd18b19

This PR is a continuation of the work started by @daniel-noland to move to a proper Nix-based build system.
Most of this PR was built based on #1275 and the work of Claude Code using Opus 4.6. As such it should be reviewed carefully. I have tried to do the work in small chunks with the AI to get some review as we go along, but I am not a nix expert and had to rely a bit on the AI's judgement as to the best approach for certain things.
TODO:
- Make failing new sanitizer runs optional - the sanitizers found real bugs we need to fix in separate PRs
- Co-pilot review of this PR before signoff (DONE)
- Remove `scripts/todo.sh` (DONE)
- Remove `scripts/install-real-nix.sh` (DONE)
- `just` targets for building and pushing containers is there (I believe we are good, but I want to confirm)