GHA: targeted cleanup — preserve BuildKit ccache across runs#229
GHA: targeted cleanup — preserve BuildKit ccache across runs#229rucoder wants to merge 2 commits intolf-edge:eve-kernel-amd64-v6.12.49-genericfrom
Conversation
- actions/checkout: v4 -> v6.0.2 (de0fac2e4500dabe0009e67214ff5f5447ce83dd) - docker/login-action: v3 -> v4.0.0 (b45d80f862d83dbcd57f89517bcf500b2ab88fb2) Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
There was a problem hiding this comment.
Pull request overview
This PR updates the CI workflows to avoid destructive Docker cleanup on self-hosted runners by switching to targeted image removal and adding BuildKit cache pruning intended to preserve ccache across runs, while also pinning GitHub Actions to SHAs and tightening PR workflow security.
Changes:
- Add
clean-gcc/clean-clangtargets and aclean-image-%rule to remove only the just-built image from linuxkit cache and Docker. - Replace
docker system prune -af/docker volume prune -afwith BuildKit cache diagnostics +buildctl pruneusing a configurable keep-storage setting. - Pin
actions/checkoutanddocker/login-actionto commit SHAs; change PR workflow trigger frompull_request_targettopull_requestand remove Docker Hub login from PR builds.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
Makefile.eve |
Adds targeted cleanup targets for linuxkit cache + Docker images. |
.github/workflows/publish.yml |
Pins actions; replaces “nuclear” prune with targeted cleanup + BuildKit pruning/diagnostics. |
.github/workflows/pr-build.yml |
Switches to pull_request, removes Docker Hub login, and adopts targeted cleanup + BuildKit pruning/diagnostics. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
.github/workflows/pr-build.yml
Outdated
| echo "=== BuildKit disk usage before prune ===" | ||
| docker exec linuxkit-builder buildctl du | tail -n 2 | ||
| echo "=== Reclaiming ===" | ||
| docker exec linuxkit-builder buildctl prune --keep-storage "${BUILDKIT_KEEP_STORAGE_MB:-32000}" | tail -n 1 | ||
| echo "=== BuildKit disk usage after prune ===" | ||
| docker exec linuxkit-builder buildctl du | tail -n 2 |
There was a problem hiding this comment.
This cleanup step will fail the job if the BuildKit container isn’t running (e.g., if the build fails before it starts) because docker exec linuxkit-builder ... will exit non-zero. Since this is an if: always() cleanup, it should be resilient: check for the container first (e.g., docker ps -q -f name=linuxkit-builder) and/or make the docker exec commands non-fatal (|| true) so cleanup doesn’t mask the real failure.
| echo "=== BuildKit disk usage before prune ===" | |
| docker exec linuxkit-builder buildctl du | tail -n 2 | |
| echo "=== Reclaiming ===" | |
| docker exec linuxkit-builder buildctl prune --keep-storage "${BUILDKIT_KEEP_STORAGE_MB:-32000}" | tail -n 1 | |
| echo "=== BuildKit disk usage after prune ===" | |
| docker exec linuxkit-builder buildctl du | tail -n 2 | |
| if docker ps -q -f name=linuxkit-builder >/dev/null; then | |
| echo "=== BuildKit disk usage before prune ===" | |
| docker exec linuxkit-builder buildctl du | tail -n 2 | |
| echo "=== Reclaiming ===" | |
| docker exec linuxkit-builder buildctl prune --keep-storage "${BUILDKIT_KEEP_STORAGE_MB:-32000}" | tail -n 1 | |
| echo "=== BuildKit disk usage after prune ===" | |
| docker exec linuxkit-builder buildctl du | tail -n 2 | |
| else | |
| echo "linuxkit-builder container not running; skipping BuildKit cleanup." | |
| fi |
Makefile.eve
Outdated
| $(IMAGE_TAG_NO_FLAVOR)-$*,) | ||
| clean-image-%: | ||
| $(LK) cache rm $(IMAGE_TAG_WITH_FLAVOR)-$* 2>/dev/null || true | ||
| docker rmi docker.io/$(IMAGE_TAG_WITH_FLAVOR)-$* 2>/dev/null || true |
There was a problem hiding this comment.
clean-image-% hardcodes docker.io/ when removing the image. Since IMAGE_REPOSITORY is configurable (and could include a different registry like ghcr.io/...), this can generate an invalid image reference (e.g., docker.io/ghcr.io/...). Consider removing the hardcoded docker.io/ prefix and using $(IMAGE_TAG_WITH_FLAVOR)-$* directly so cleanup works with any configured registry.
| docker rmi docker.io/$(IMAGE_TAG_WITH_FLAVOR)-$* 2>/dev/null || true | |
| docker rmi $(IMAGE_TAG_WITH_FLAVOR)-$* 2>/dev/null || true |
.github/workflows/publish.yml
Outdated
| echo "=== Reclaiming ===" | ||
| docker exec linuxkit-builder buildctl prune --keep-storage "${BUILDKIT_KEEP_STORAGE_MB:-32000}" | tail -n 1 |
There was a problem hiding this comment.
buildctl prune --keep-storage expects a byte size (often with units like GB). Passing ${BUILDKIT_KEEP_STORAGE_MB:-32000} as a raw number while naming it “_MB” will likely be interpreted as 32000 bytes and prune far more aggressively than intended. Consider either (a) renaming the env var to reflect bytes/units, or (b) appending an explicit unit suffix (e.g., MB/GB) when invoking buildctl so the default keeps ~32GB as described in the PR.
| echo "=== Reclaiming ===" | |
| docker exec linuxkit-builder buildctl prune --keep-storage "${BUILDKIT_KEEP_STORAGE_MB:-32000}" | tail -n 1 | |
| echo "=== Reclaiming===" | |
| docker exec linuxkit-builder buildctl prune --keep-storage "${BUILDKIT_KEEP_STORAGE_MB:-32000}MB" | tail -n 1 |
.github/workflows/publish.yml
Outdated
| echo "=== BuildKit disk usage before prune ===" | ||
| docker exec linuxkit-builder buildctl du | tail -n 2 | ||
| echo "=== Reclaiming ===" | ||
| docker exec linuxkit-builder buildctl prune --keep-storage "${BUILDKIT_KEEP_STORAGE_MB:-32000}" | tail -n 1 | ||
| echo "=== BuildKit disk usage after prune ===" | ||
| docker exec linuxkit-builder buildctl du | tail -n 2 |
There was a problem hiding this comment.
This cleanup step will fail the job if the BuildKit container isn’t running (e.g., if the build fails before it starts) because docker exec linuxkit-builder ... will exit non-zero. Since this is an if: always() cleanup, it should be resilient: check for the container first (e.g., docker ps -q -f name=linuxkit-builder) and/or make the docker exec commands non-fatal (|| true) so cleanup doesn’t mask the real failure.
| echo "=== BuildKit disk usage before prune ===" | |
| docker exec linuxkit-builder buildctl du | tail -n 2 | |
| echo "=== Reclaiming ===" | |
| docker exec linuxkit-builder buildctl prune --keep-storage "${BUILDKIT_KEEP_STORAGE_MB:-32000}" | tail -n 1 | |
| echo "=== BuildKit disk usage after prune ===" | |
| docker exec linuxkit-builder buildctl du | tail -n 2 | |
| if docker ps -q -f name=linuxkit-builder >/dev/null 2>&1; then | |
| echo "=== BuildKit disk usage before prune ===" | |
| docker exec linuxkit-builder buildctl du | tail -n 2 || true | |
| echo "=== Reclaiming ===" | |
| docker exec linuxkit-builder buildctl prune --keep-storage "${BUILDKIT_KEEP_STORAGE_MB:-32000}" | tail -n 1 || true | |
| echo "=== BuildKit disk usage after prune ===" | |
| docker exec linuxkit-builder buildctl du | tail -n 2 || true | |
| else | |
| echo "linuxkit-builder container not running; skipping BuildKit disk cleanup." | |
| fi |
.github/workflows/pr-build.yml
Outdated
| echo "=== BuildKit disk usage before prune ===" | ||
| docker exec linuxkit-builder buildctl du | tail -n 2 | ||
| echo "=== Reclaiming ===" | ||
| docker exec linuxkit-builder buildctl prune --keep-storage "${BUILDKIT_KEEP_STORAGE_MB:-32000}" | tail -n 1 |
There was a problem hiding this comment.
buildctl prune --keep-storage expects a byte size (often with units like GB). Passing ${BUILDKIT_KEEP_STORAGE_MB:-32000} as a raw number while naming it “_MB” will likely be interpreted as 32000 bytes and prune far more aggressively than intended. Consider either (a) renaming the env var to reflect bytes/units, or (b) appending an explicit unit suffix (e.g., MB/GB) when invoking buildctl so the default keeps ~32GB as described in the PR.
| docker exec linuxkit-builder buildctl prune --keep-storage "${BUILDKIT_KEEP_STORAGE_MB:-32000}" | tail -n 1 | |
| docker exec linuxkit-builder buildctl prune --keep-storage "${BUILDKIT_KEEP_STORAGE_MB:-32000}MB" | tail -n 1 |
There was a problem hiding this comment.
--keep-storage value Keep data below this limit (in MB) (default: 0)
- Replace nuclear 'docker system prune -af' and 'docker volume prune -af' with targeted removal of only the image we just built - Add clean-gcc/clean-clang Makefile targets that remove specific image from both linuxkit cache and docker - Add BuildKit cache pruning with configurable quota via BUILDX_KEEP_STORAGE_MB env var (defaults to 20000 MB) to prevent unbounded disk growth - Print BuildKit disk usage before pruning for diagnostics - Use buildctl inside linuxkit-builder container for cache management - Switch pr-build.yml from pull_request_target to pull_request - Remove unnecessary Docker Hub login from PR builds Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
46abc01 to
160f594
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| on: # yamllint disable-line rule:truthy | ||
| pull_request_target: | ||
| pull_request: | ||
| branches: | ||
| - "gha-test-rebase" | ||
| - "eve-kernel-*" |
There was a problem hiding this comment.
Switching this workflow to pull_request means PR code is executed on runs-on: [runner-xl, X64, Linux] (a self-hosted runner label). If PRs can come from forks, this enables untrusted contributors to run arbitrary code on the self-hosted runner host, which is a high-risk security posture. Consider gating the job to same-repo PRs (e.g., only when github.event.pull_request.head.repo.full_name == github.repository), or moving PR builds to GitHub-hosted runners / a hardened ephemeral runner pool for external contributions.
Summary
Finally move back to
on: pull_requestand remove docker login. Also rework broken clean-up target and properly control disk consumptionReplace the nuclear
docker system prune -af/docker volume prune -afin CI cleanup with targeted removal, preserving BuildKit ccache across runs on self-hosted runners.Changes
Commit 1: Pin actions to commit SHAs and bump to latest versions
actions/checkout: v4 → v6.0.2 (pinned to SHA)docker/login-action: v3 → v4.0.0 (pinned to SHA)Commit 2: Targeted cleanup — preserve BuildKit ccache
docker system prune -afanddocker volume prune -afwith targeted removal of only the image we just builtclean-gcc/clean-clangMakefile targets that remove specific image from both linuxkit cache and dockerBUILDKIT_KEEP_STORAGE_MBenv var (defaults to 32000 MB)buildctlinsidelinuxkit-buildercontainer for cache managementpull_request_targettopull_requestTesting
Tested via PR #205 against
gha-test-rebasebranch on production runners.