CI: split build/test, cross-backend jobs, enable aarch64-linux #163

Open
angerman wants to merge 6 commits into stable-ghc-9.14 from feat/nix-ci-split

angerman commented Mar 2, 2026

Summary

  • Rename ci.yml to nix-ci.yml (closes #51, "Move ci.yml to nix-ci.yml")
  • Split each CI matrix entry into separate build and test jobs connected by artifacts
  • Enable aarch64-linux in the matrix (devx containers now available)
  • Add dynamic dimension to the matrix (dynamic=0 and dynamic=1)
  • Add cross-js and cross-wasm CI jobs that build cross-target libraries from the dist artifact
  • Makefile: add configurable GHC_TOOLCHAIN_BIN and HAPPY_TEMPLATE_DIR for dist-based cross builds
  • Temporarily disable release workflow while restructuring

Job structure (14 total)

build (6 jobs: 3 platforms × 2 dynamic modes)
  ├── test (6 jobs, downloads dist, runs testsuite)
  ├── cross-js (1 job, builds JS backend from x86_64-linux dist)
  └── cross-wasm (1 job, builds WASM backend, continue-on-error)
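
The job count above can be checked by enumerating the matrix. This is a minimal shell sketch, not part of the workflow. It assumes the three platforms named in the workflow's runs-on mapping and an artifact name of the form `<plat>-dynamic<N>-dist`. That pattern is extrapolated from the single artifact name the review comments mention (`x86_64-linux-dynamic0-dist`), so the other five names are an assumption.

```shell
#!/bin/sh
# Enumerate the build matrix: 3 platforms x 2 dynamic modes.
# The "<plat>-dynamic<N>-dist" naming is an assumption extrapolated from
# the one artifact name that appears in the review comments.
list_build_artifacts() {
  for plat in x86_64-linux aarch64-linux aarch64-darwin; do
    for dynamic in 0 1; do
      printf '%s-dynamic%s-dist\n' "$plat" "$dynamic"
    done
  done
}

# Arithmetic expansion strips the padding some `wc` implementations emit.
builds=$(( $(list_build_artifacts | wc -l) ))
# One test job per build, plus cross-js and cross-wasm.
total=$(( builds + builds + 2 ))
echo "build jobs: $builds, total jobs: $total"
```

Running it prints `build jobs: 6, total jobs: 14`, matching the "14 total" in the heading.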

Benefits

  • Faster feedback: build failures detected without waiting for tests
  • Rerun tests only: if tests flake, re-run without ~49min rebuild
  • Cross-backend verification: JS and WASM backends tested on every push
  • aarch64-linux coverage: native ARM Linux builds now included

Test plan

  • Build jobs complete and upload artifacts for all 6 matrix entries
  • Test jobs download artifacts and run testsuite successfully
  • Test result artifacts (perf.csv, summary.txt, junit.xml) uploaded correctly
  • cross-js job builds JS cross libraries and passes node smoke test
  • cross-wasm job attempts WASM build (continue-on-error expected)
  • Re-running a test job alone works without triggering a rebuild

Addresses #49, #30, #134.

angerman requested a review from Copilot on March 2, 2026 03:01

Copilot AI left a comment

Pull request overview

This PR restructures GitHub Actions CI for the Nix-based build to split build/test, add aarch64-linux coverage, and validate cross-backend builds (JS/WASM) using dist artifacts, while temporarily disabling the release workflow during the transition.

Changes:

  • Replaces the old monolithic CI workflow with nix-ci.yml, splitting each matrix entry into separate build and test jobs with uploaded artifacts.
  • Adds CI coverage for aarch64-linux plus a new dynamic matrix dimension, and introduces cross-js / cross-wasm jobs from the dist artifact.
  • Updates the Makefile to support dist-based cross builds via configurable GHC_TOOLCHAIN_BIN and HAPPY_TEMPLATE_DIR.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| Makefile | Adds knobs and bindist contents needed to build cross targets from downloaded dist artifacts. |
| .github/workflows/nix-ci.yml | New CI workflow with split build/test, expanded matrix, and cross-backend jobs. |
| .github/workflows/release.yml | Temporarily disables release workflow execution. |
| .github/workflows/ci.yml | Removes the old CI workflow in favor of the new Nix CI. |


Comment thread .github/workflows/nix-ci.yml Outdated
Comment on lines +74 to +77
  test:
    name: "Test / ${{ matrix.plat }} / dynamic=${{ matrix.dynamic }}"
    needs: build
    runs-on: "${{ fromJSON('{\"x86_64-linux\": \"ubuntu-24.04\", \"aarch64-linux\": \"ubuntu-24.04-arm\", \"aarch64-darwin\": \"macos-latest\"}')[matrix.plat] }}"

Copilot AI Mar 2, 2026

needs: build on a matrix job will block every test run until all build matrix runs finish, which reduces parallelism and can significantly lengthen end-to-end CI time. If the intent is for each (plat,dynamic) test to start as soon as its matching build artifact is uploaded, consider restructuring into explicit per-entry build jobs (e.g., via a reusable workflow invoked once per include entry) so each test job can needs: its corresponding build job, rather than the whole build matrix.

Comment thread .github/workflows/nix-ci.yml Outdated
Comment on lines +135 to +137
  cross-js:
    name: "Cross: JS / x86_64-linux"
    needs: build

Copilot AI Mar 2, 2026

Both cross jobs only consume x86_64-linux-dynamic0-dist, but needs: build will wait for the entire build matrix to complete before starting. To avoid unnecessary delays, consider introducing a dedicated build job for x86_64-linux dynamic=0 (or splitting the build matrix into per-entry jobs) and have cross-js / cross-wasm depend only on that specific job.

Comment thread .github/workflows/nix-ci.yml Outdated
Comment on lines +213 to +215
  cross-wasm:
    name: "Cross: WASM / x86_64-linux"
    needs: build

Copilot AI Mar 2, 2026

Both cross jobs only consume x86_64-linux-dynamic0-dist, but needs: build will wait for the entire build matrix to complete before starting. To avoid unnecessary delays, consider introducing a dedicated build job for x86_64-linux dynamic=0 (or splitting the build matrix into per-entry jobs) and have cross-js / cross-wasm depend only on that specific job.

Comment thread .github/workflows/nix-ci.yml
Comment on lines +252 to +253
curl -fsSL https://gitlab.haskell.org/ghc/ghc-wasm-meta/-/raw/master/bootstrap.sh | \
FLAVOUR=9.12 PREFIX=$HOME/.ghc-wasm sh

Copilot AI Mar 2, 2026

Piping a remote script directly into sh is a high-risk pattern (integrity/compromise, and master can change). Prefer downloading a specific, pinned revision (or release artifact), validating it (checksum/signature), and then executing it from disk.

Suggested change
- curl -fsSL https://gitlab.haskell.org/ghc/ghc-wasm-meta/-/raw/master/bootstrap.sh | \
-   FLAVOUR=9.12 PREFIX=$HOME/.ghc-wasm sh
+ # Download a pinned revision of the bootstrap script and execute it from disk
+ BOOTSTRAP_URL="https://gitlab.haskell.org/ghc/ghc-wasm-meta/-/raw/9572a0c6b4c6c4e0a8a5d4e1f3b4a9e2f1c0d7e3/bootstrap.sh"
+ BOOTSTRAP_SCRIPT="/tmp/ghc-wasm-bootstrap.sh"
+ curl -fsSL "$BOOTSTRAP_URL" -o "$BOOTSTRAP_SCRIPT"
+ chmod +x "$BOOTSTRAP_SCRIPT"
+ FLAVOUR=9.12 PREFIX="$HOME/.ghc-wasm" sh "$BOOTSTRAP_SCRIPT"
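
The comment also asks for validation (checksum/signature), which pinning a revision alone does not provide. A minimal sketch of the checksum step, assuming GNU coreutils `sha256sum` is available on the runner. The `verify_sha256` helper and the `BOOTSTRAP_SHA256` variable are hypothetical names, and the digest itself would have to be recorded by hand when the pinned revision is chosen.

```shell
#!/bin/sh
# Check a downloaded file against an expected SHA-256 digest before running it.
# Assumes GNU coreutils `sha256sum` is on PATH (an assumption about the runner).
verify_sha256() {
  file=$1
  expected=$2
  actual=$(sha256sum "$file" | cut -d ' ' -f 1)
  # An empty digest means hashing failed; treat that as a mismatch.
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Hypothetical use in the CI step (BOOTSTRAP_SHA256 is not in the PR):
#   curl -fsSL "$BOOTSTRAP_URL" -o "$BOOTSTRAP_SCRIPT"
#   verify_sha256 "$BOOTSTRAP_SCRIPT" "$BOOTSTRAP_SHA256" || exit 1
#   FLAVOUR=9.12 PREFIX="$HOME/.ghc-wasm" sh "$BOOTSTRAP_SCRIPT"
```

With this in place a changed upstream script fails the job instead of being executed.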

Comment thread Makefile Outdated
Comment on lines +780 to +782
  _build/stage3/lib/targets/wasm32-unknown-wasi/lib/settings: _build/stage2/lib/targets/wasm32-unknown-wasi $(GHC_TOOLCHAIN_BIN)
      @mkdir -p $(@D)
-     PATH=/home/hasufell/.ghc-wasm/wasi-sdk/bin:$(PATH) _build/stage1/bin/ghc-toolchain-bin $(GHC_TOOLCHAIN_ARGS) --triple wasm32-unknown-wasi --output-settings -o $@ --cc wasm32-wasi-clang --cxx wasm32-wasi-clang++ --ar ar --ranlib ranlib --ld wasm-ld --merge-objs wasm-ld --merge-objs-opt="-r" --disable-ld-override --disable-tables-next-to-code $(foreach opt,$(WASM_CC_OPTS),--cc-opt=$(opt)) $(foreach opt,$(WASM_CXX_OPTS),--cxx-opt=$(opt))
+     $(GHC_TOOLCHAIN_BIN) $(GHC_TOOLCHAIN_ARGS) --triple wasm32-unknown-wasi --output-settings -o $@ --cc wasm32-wasi-clang --cxx wasm32-wasi-clang++ --ar ar --ranlib ranlib --ld wasm-ld --merge-objs wasm-ld --merge-objs-opt="-r" --disable-ld-override --disable-tables-next-to-code $(foreach opt,$(WASM_CC_OPTS),--cc-opt=$(opt)) $(foreach opt,$(WASM_CXX_OPTS),--cxx-opt=$(opt))

Copilot AI Mar 2, 2026

This rule now invokes wasm32-wasi-clang / wasm-ld without ensuring the wasi-sdk bin directory is on PATH (the previous recipe prepended a wasi-sdk path). If users run this target outside the CI step that exports PATH, the target is likely to fail. Consider adding a configurable WASI_SDK_BIN_DIR ?= ... (or similar) and prepending it to PATH in the recipe (or passing absolute tool paths) so the target is self-contained and consistent between CI and local use.
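
One way to picture the suggested knob, sketched in shell rather than as a Makefile change. `WASI_SDK_BIN_DIR` is the hypothetical variable name from the comment above, and the `wasi_path` and `missing_wasi_tools` helpers are illustrative only. The sketch prepends the directory to PATH when it is set and reports which of the tools the rule invokes cannot be found.

```shell
#!/bin/sh
# Build the PATH the recipe would use: WASI_SDK_BIN_DIR (hypothetical knob)
# goes first when it is set, otherwise PATH is left untouched.
wasi_path() {
  if [ -n "${WASI_SDK_BIN_DIR:-}" ]; then
    printf '%s:%s\n' "$WASI_SDK_BIN_DIR" "$PATH"
  else
    printf '%s\n' "$PATH"
  fi
}

# Print every wasi-sdk tool the settings rule names that is not found on
# the effective PATH; prints nothing when all of them resolve.
missing_wasi_tools() {
  for tool in wasm32-wasi-clang wasm32-wasi-clang++ wasm-ld; do
    ( PATH=$(wasi_path); command -v "$tool" >/dev/null 2>&1 ) || echo "$tool"
  done
}
```

A local run could call `missing_wasi_tools` before `make` and stop early with a readable message instead of failing inside ghc-toolchain.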

angerman changed the base branch from stable-ghc-9.14-rebased to stable-ghc-9.14 on March 2, 2026 22:35
angerman force-pushed the feat/nix-ci-split branch 9 times, most recently from f47906e to 1c25e5c, on March 4, 2026 23:54
angerman force-pushed the feat/nix-ci-split branch 2 times, most recently from 659432d to 319dc15, on March 11, 2026 01:51
angerman force-pushed the feat/nix-ci-split branch 4 times, most recently from f87b28e to 1752fd7, on March 14, 2026 11:08
…tion

Add stage3 cross-compilation targets for WASM and JS backends using
dist-based builds. Configure toolchain arguments for emscripten and
wasi-sdk cross-compilers, including CPP/HS_CPP/CMM_CPP and linker
options.

Use stamp-based order-only prerequisites ($(STAGE1_STAMP),
$(STAGE2_STAMP)) so downstream stages skip already-completed builds
without re-running recipes. Remove recursive make fallback rules
that caused BSD make incompatibility on FreeBSD.

Fix --cc-linker-opt typo to --cc-link-opt for ghc-toolchain.

Restructure nix-ci.yml from a monolithic build into separate jobs:
- Build: stage0+stage1 then stage2, with aggressive intermediate cleanup
- Test: download dist artifact and run testsuite standalone
- Cross: JS (emscripten) and WASM (wasi-sdk) backend builds

Full platform matrix: x86_64-linux (ubuntu-latest), aarch64-linux
(ubuntu-24.04-arm), aarch64-darwin (self-hosted), each with dynamic=0
and dynamic=1 configurations. Cross-builds run in parallel with tests
using continue-on-error.

Pass DYNAMIC to both stage1 and stage2 so configure generates consistent
cabal.project.stage2.settings (shared: True when dynamic). Always pass
USE_SYSTEM_CABAL=1 to stage2 since cabal was already built in stage1.
Add test timing file creation for metrics-test.svg generation.

angerman force-pushed the feat/nix-ci-split branch from 1752fd7 to 23d9ff2 on March 14, 2026 11:49

When the assembler emits R_*_NONE (type 0) relocations against a symbol
that was optimised away (e.g. a zero-length static array), the symbol
may remain in the symbol table as undefined (STT_NOTYPE, addr 0x0).

fillGot() iterates all symbols and tries to resolve undefined ones via
lookupDependentSymbol(). For symbols like these, the lookup fails
because the symbol genuinely doesn't exist anywhere — it was removed by
the compiler. However, the only relocations referencing it are NONE
(no-op), so it never needs to be resolved.

Add symbolHasNonNoneRelocation() which scans all REL/RELA tables to
check whether a symbol is referenced by any relocation with type != 0.
When fillGot() fails to resolve a symbol, it now checks this function
and assigns a dummy address (0xDEAD0000) for symbols that are only
NONE-referenced, rather than returning EXIT_FAILURE.

This fixes the reloc-none test on aarch64-linux with GCC 13/14, where
`static int a[0]` gets optimised away but `.reloc ., R_AARCH64_NONE, a`
still creates an undefined global symbol.
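
The same question ("is this symbol referenced by anything other than a NONE relocation?") can be asked of an object file from the shell when debugging. The sketch below is a diagnostic helper, not the RTS linker code. It assumes the usual column layout of `readelf -rW` output (offset, info, type, symbol value, symbol name), and `has_non_none_reloc` is a hypothetical name.

```shell
#!/bin/sh
# Read `readelf -rW` style output on stdin and report whether SYMBOL ($1)
# is the target of any relocation whose type is not R_*_NONE.
# Exit status 0: at least one real relocation references it.
# Exit status 1: it is only NONE-referenced, or not referenced at all.
has_non_none_reloc() {
  awk -v sym="$1" '
    # Relocation rows carry the type in column 3 and the symbol name in 5.
    $3 ~ /^R_/ && $5 == sym && $3 !~ /_NONE$/ { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Example (obj.o is a placeholder file name):
#   readelf -rW obj.o | has_non_none_reloc a || echo "a is only NONE-referenced"
```

For the reloc-none test object, the symbol `a` would be expected to fall into the "only NONE-referenced" case.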

The stamp files (STAGE1_STAMP, STAGE2_STAMP) were created as side effects
of the stage1/stage2 recipes but never declared as Make targets.  This
caused `make _build/dist/ghc.tar.gz` (used by the release workflow) to fail
with "No rule to make target '_build/.stamp-stage1'" because Make could not
resolve the dependency chain:

  ghc.tar.gz -> stage2 -> $(GHC1) -> $(STAGE1_STAMP) -> ???

The nix-ci workflow was unaffected because it runs `make stage1` and
`make stage2` as separate steps, so the stamp file already existed on disk
when stage2 ran.

Fix: make each stamp file the actual target of its stage's recipe, and
provide PHONY aliases (stage1, stage2) for convenience.  This lets Make
resolve the full dependency graph in a single invocation while still
skipping already-completed stages.
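
The shape of that fix can be tried in isolation. The sketch below writes a toy Makefile from shell and runs it twice. The stamp names, the echo recipes, and the `write_stamp_makefile` helper are illustrative only, not the project's actual rules, and the demo is skipped when `make` is not installed.

```shell
#!/bin/sh
# Write a toy Makefile in which each stamp file is the real target of its
# stage and stage1/stage2 are PHONY aliases. All names are illustrative.
write_stamp_makefile() {
  # %b expands \t into the tab Make requires before recipe lines.
  printf '%b\n' \
    '.PHONY: stage1 stage2' \
    'stage1: .stamp-stage1' \
    'stage2: .stamp-stage2' \
    '.stamp-stage1:' \
    '\t@echo building stage1' \
    '\t@touch $@' \
    '.stamp-stage2: .stamp-stage1' \
    '\t@echo building stage2' \
    '\t@touch $@' \
    > "$1/Makefile"
}

if command -v make >/dev/null 2>&1; then
  demo=$(mktemp -d)
  write_stamp_makefile "$demo"
  # First run: Make resolves stage2 -> .stamp-stage2 -> .stamp-stage1.
  first=$(cd "$demo" && make -s stage2)
  # Second run: both stamps exist, so no recipe runs.
  second=$(cd "$demo" && make -s stage2)
  printf 'first run:\n%s\nsecond run:\n%s\n' "$first" "$second"
fi
```

A single `make stage2` from a clean directory builds both stages in order, which is the case the release workflow hit, and a repeat invocation does no work.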

PHONY targets (stable-cabal, libraries/ghc-boot-th-next) as normal
prerequisites of stamp file targets cause Make to always consider
the stamp out-of-date, defeating the incremental build optimization.

Move these to order-only prerequisites (after |) so they ensure
execution order without triggering unnecessary rebuilds.

Also add stable-cabal as order-only prerequisite of the hackage
index rule — it uses $(CABAL) but had no dependency on stable-cabal,
which could race under make -jN from a clean build.

…y version

- ghc.tar.gz: depend on $(STAGE2_STAMP) instead of PHONY stage2 to
  avoid unnecessary tarball rebuilds when stage2 is already current.
- cabal.tar.gz: make stable-cabal order-only (PHONY target).
- haskell-toolchain.tar.gz: depend on $(STAGE2_STAMP) as normal prereq,
  move PHONY targets (stable-cabal, stage3) to order-only.
- test: revert to $(STAGE2_STAMP) instead of PHONY stage2, consistent
  with stamp-based incremental build design.
- happy-lib: replace hardcoded version 2.1.5 with shell glob and
  $(wildcard) to survive version bumps.
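
The last bullet's glob idea, sketched in shell. The directory layout (`<root>/happy-lib-<version>`) and the `find_happy_lib` helper are assumptions for illustration. The commit message only says that the hardcoded 2.1.5 was replaced with a shell glob and `$(wildcard)`.

```shell
#!/bin/sh
# Locate a versioned happy-lib directory without hardcoding the version.
# The "<root>/happy-lib-<version>" layout is a hypothetical example.
find_happy_lib() {
  root=$1
  for candidate in "$root"/happy-lib-*; do
    # An unmatched glob stays literal in POSIX sh, so test for existence.
    if [ -d "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}
```

The function returns the first match and a non-zero status when nothing matches, so a caller can fail loudly rather than pass an unexpanded pattern along.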

Development

Successfully merging this pull request may close these issues.

Move ci.yml to nix-ci.yml

2 participants