Summary
The current build-release.yml critical path is 17-18m for tag releases (excluding macOS ARM queue which adds 0-6h). After PR #368 (macOS consolidation, merged), this plan targets ~2m for push-to-main and ~8m for tag releases through 9 validated optimizations across all 3 workflows. All changes were validated through structured expert panel analysis with parallel red-team and domain-specific assessments.
Problem
Measured Baselines
Job
Duration
Runner
test (Linux x86)
55s
ubuntu-24.04
test (Linux ARM)
62s
ubuntu-24.04-arm
test (Windows)
92-136s
windows-latest
Build binary (Linux x86)
47s
ubuntu-24.04
Build binary (Linux ARM)
50s
ubuntu-24.04-arm
Build binary (Windows)
66-80s
windows-latest
macOS Intel (consolidated)
115s
macos-15-intel
macOS ARM (consolidated)
~120s + queue
macos-latest
Integration tests (Linux x86)
135-238s
ubuntu-24.04
Release validation (any)
67-99s
varies
create-release
~60s
ubuntu-24.04
Optimizations
O1: Remove setup-node@v4 from unit test/build jobs
Risk: Low
Savings: ~15-25s per job (4 jobs total)
Safe to remove from: test + build in build-release.yml, test in ci.yml, smoke-test in ci-integration.yml
MUST KEEP in: integration-tests and release-validation (apm runtime setup copilot → npm install -g)
Red-team validated: exact boundary identified
O2: Replace curl | sh with astral-sh/setup-uv@v7
Risk: Low
Savings: ~20-40s per job + supply chain hardening
O3: Merge test + build into build-and-test
Risk: Low
Savings: ~1.5m (eliminates runner re-provisioning)
Red-team validated: no outputs consumed, identical matrices, dependency extras are a superset
macOS jobs become root nodes (no needs:) — they run their own tests
8 downstream needs: arrays rewritten
O5: Cache UPX installation
Risk: Low
Savings: ~15-30s per job
Linux: cache binary or pin GitHub Release download
macOS: direct release binary download instead of brew install
O9: Decouple live inference from release pipeline
Risk: Low (expert panel unanimous)
Savings: Eliminates 14.9% false-negative rate, reduces secret exposure
Remove apm run tests from integration-tests and release-validation
Remove GH_MODELS_PAT from release pipeline jobs
Create ci-runtime.yml — nightly + manual dispatch + path-filtered, Linux x86_64 only
Keep hero scenario validation up to apm compile step (validates 95% of plumbing)
Rationale:
APM's core UVPs (install, compile, audit, governance) require zero live inference
apm run is explicitly marked EXPERIMENTAL in documentation
8 inference executions × 2% failure rate = 14.9% false-negative per release
GH_MODELS_PAT in 12+ jobs = unnecessary secret exposure surface
Industry precedent: npm, cargo, pip, Homebrew, Docker CLI all isolate external API tests from release gates
O7: Fix ci-integration.yml circular dependency
Risk: Medium
Problem: When ci.yml has a bug → ci-integration approval skips → all tests skip → failure status → CI-fixing PR blocked
Design a bypass for PRs that only modify .github/workflows/
O8: Add PR traceability to ci-integration.yml
Risk: None
Improvement: DX enhancement
::notice annotation with originating PR URL in smoke-test job
Pipeline Diagrams
Current Pipeline (tag release)
graph LR
test["🧪 test<br/>(3 platforms)"]
build["📦 build<br/>(3 platforms)"]
mac_intel["🍎 macOS Intel<br/>(build+validate)"]
mac_arm["🍎 macOS ARM<br/>(build+validate)"]
integ["🔬 integration-tests<br/>(3 platforms)"]
relval["✅ release-validation<br/>(3 platforms)"]
release["🏷️ create-release"]
compat["🔗 gh-aw-compat"]
pypi["📤 publish-pypi"]
brew["🍺 update-homebrew"]
scoop["🪣 update-scoop"]
test --> build
test --> mac_intel
test --> mac_arm
test --> integ
build --> integ
test --> relval
build --> relval
integ --> relval
test --> release
build --> release
mac_intel --> release
mac_arm --> release
integ --> release
relval --> release
release --> compat
compat --> pypi
pypi --> brew
pypi --> scoop
Loading
Proposed Pipeline (tag release, after optimization)
graph LR
bat["🧪📦 build-and-test<br/>(3 platforms)"]
mac_intel["🍎 macOS Intel<br/>(build+validate)"]
mac_arm["🍎 macOS ARM<br/>(build+validate)"]
integ["🔬 integration-tests<br/>(3 platforms, no inference)"]
relval["✅ release-validation<br/>(3 platforms, no inference)"]
release["🏷️ create-release"]
compat["🔗 gh-aw-compat"]
pypi["📤 publish-pypi"]
brew["🍺 update-homebrew"]
scoop["🪣 update-scoop"]
bat --> integ
bat --> relval
integ --> relval
bat --> release
mac_intel --> release
mac_arm --> release
integ --> release
relval --> release
release --> compat
compat --> pypi
pypi --> brew
pypi --> scoop
Loading
Separate: Runtime Inference Validation (ci-runtime.yml)
graph LR
trigger["⏰ Nightly / 🔀 Path filter / 👆 Manual"]
inference["🤖 live-inference-smoke<br/>(Linux x86_64 only)"]
report["📊 Status annotation"]
trigger --> inference
inference --> report
Loading
Implementation Phases
Phase 1: Quick wins (no structural change)
Phase 2: Structural optimizations
Phase 3: DX improvements
Expected Outcomes
Pipeline
Current
Target
Improvement
PR CI (ci.yml)
1.7m
~1.3m
-24%
PR CI + Integration
7-11m
~5m
-45%
Push to main
4m
~2m
-50%
Tag release (no queue)
17-18m
~8m
-53%
Tag release (with queue)
30m-6h+
8m + 1× ARM queue
eliminated 3 queue slots
Validation Methodology
All changes were validated through a structured expert panel process:
Round 1 (Red Team) : 4 parallel agents assessed structural changes (Node.js removal, setup-uv migration, test+build merge, ci-integration analysis)
Round 2 (Inference Panel) : 3 parallel domain experts assessed live inference in CI/CD (Product Strategy Architect, Release Reliability Architect, Supply Chain Security Engineer)
Decisions and Trade-offs
Rejected optimizations
O4: Parallelize integration tests with builds — Integration tests use the compiled binary (verified: test-integration.sh downloads artifact, sets up binary in PATH). Dependency is correct.
O6: Flatten release tail — gh-aw-compat downloads the just-released binary via microsoft/apm-action@v1. Chain must stay sequential.
Inference decoupling rationale (O9)
APM's core UVPs (install, compile, audit, governance) require zero live inference
apm run is explicitly marked EXPERIMENTAL in documentation
8 inference executions × 2% failure rate = 14.9% false-negative per release
GH_MODELS_PAT in 12+ jobs = unnecessary secret exposure surface
Industry precedent: npm, cargo, pip, Homebrew, Docker CLI all isolate external API tests from release gates
Summary
The current
build-release.ymlcritical path is 17-18m for tag releases (excluding macOS ARM queue which adds 0-6h). After PR #368 (macOS consolidation, merged), this plan targets ~2m for push-to-main and ~8m for tag releases through 9 validated optimizations across all 3 workflows. All changes were validated through structured expert panel analysis with parallel red-team and domain-specific assessments.Problem
test → buildsequential dependency wastes ~1.5m of runner re-provisioningcurl | shuv installations are a supply chain risk (tracked in ci: replacecurl | shuv installs withastral-sh/setup-uvaction for supply-chain hardening #369)apm run) gate releases on external GitHub Models API — 14.9% false-negative probability per releaseGH_MODELS_PATinjected into 12+ job instances across the pipelineMeasured Baselines
Optimizations
O1: Remove
setup-node@v4from unit test/build jobsRisk: Low
Savings: ~15-25s per job (4 jobs total)
test+buildin build-release.yml,testin ci.yml,smoke-testin ci-integration.ymlintegration-testsandrelease-validation(apm runtime setup copilot → npm install -g)O2: Replace
curl | shwithastral-sh/setup-uv@v7Risk: Low
Savings: ~20-40s per job + supply chain hardening
actions/cache@v3for uv (setup-uv caches natively)curl | shuv installs withastral-sh/setup-uvaction for supply-chain hardening #369O3: Merge
test+buildintobuild-and-testRisk: Low
Savings: ~1.5m (eliminates runner re-provisioning)
needs:) — they run their own testsneeds:arrays rewrittenO5: Cache UPX installation
Risk: Low
Savings: ~15-30s per job
brew installO9: Decouple live inference from release pipeline
Risk: Low (expert panel unanimous)
Savings: Eliminates 14.9% false-negative rate, reduces secret exposure
apm runtests from integration-tests and release-validationGH_MODELS_PATfrom release pipeline jobsci-runtime.yml— nightly + manual dispatch + path-filtered, Linux x86_64 onlyapm compilestep (validates 95% of plumbing)Rationale:
apm runis explicitly marked EXPERIMENTAL in documentationGH_MODELS_PATin 12+ jobs = unnecessary secret exposure surfaceO7: Fix ci-integration.yml circular dependency
Risk: Medium
Problem: When ci.yml has a bug → ci-integration approval skips → all tests skip → failure status → CI-fixing PR blocked
Design a bypass for PRs that only modify
.github/workflows/O8: Add PR traceability to ci-integration.yml
Risk: None
Improvement: DX enhancement
::noticeannotation with originating PR URL in smoke-test jobPipeline Diagrams
Current Pipeline (tag release)
graph LR test["🧪 test<br/>(3 platforms)"] build["📦 build<br/>(3 platforms)"] mac_intel["🍎 macOS Intel<br/>(build+validate)"] mac_arm["🍎 macOS ARM<br/>(build+validate)"] integ["🔬 integration-tests<br/>(3 platforms)"] relval["✅ release-validation<br/>(3 platforms)"] release["🏷️ create-release"] compat["🔗 gh-aw-compat"] pypi["📤 publish-pypi"] brew["🍺 update-homebrew"] scoop["🪣 update-scoop"] test --> build test --> mac_intel test --> mac_arm test --> integ build --> integ test --> relval build --> relval integ --> relval test --> release build --> release mac_intel --> release mac_arm --> release integ --> release relval --> release release --> compat compat --> pypi pypi --> brew pypi --> scoopProposed Pipeline (tag release, after optimization)
graph LR bat["🧪📦 build-and-test<br/>(3 platforms)"] mac_intel["🍎 macOS Intel<br/>(build+validate)"] mac_arm["🍎 macOS ARM<br/>(build+validate)"] integ["🔬 integration-tests<br/>(3 platforms, no inference)"] relval["✅ release-validation<br/>(3 platforms, no inference)"] release["🏷️ create-release"] compat["🔗 gh-aw-compat"] pypi["📤 publish-pypi"] brew["🍺 update-homebrew"] scoop["🪣 update-scoop"] bat --> integ bat --> relval integ --> relval bat --> release mac_intel --> release mac_arm --> release integ --> release relval --> release release --> compat compat --> pypi pypi --> brew pypi --> scoopSeparate: Runtime Inference Validation (ci-runtime.yml)
graph LR trigger["⏰ Nightly / 🔀 Path filter / 👆 Manual"] inference["🤖 live-inference-smoke<br/>(Linux x86_64 only)"] report["📊 Status annotation"] trigger --> inference inference --> reportImplementation Phases
Phase 1: Quick wins (no structural change)
curl | shuv installs withastral-sh/setup-uvaction for supply-chain hardening #369)Phase 2: Structural optimizations
Phase 3: DX improvements
Expected Outcomes
Validation Methodology
All changes were validated through a structured expert panel process:
Decisions and Trade-offs
Rejected optimizations
gh-aw-compatdownloads the just-released binary viamicrosoft/apm-action@v1. Chain must stay sequential.Inference decoupling rationale (O9)
apm runis explicitly marked EXPERIMENTAL in documentationGH_MODELS_PATin 12+ jobs = unnecessary secret exposure surface