Fix concurrent multi-platform push failures in local registry (registry:3) #632

Closed
Copilot wants to merge 2 commits into master from copilot/fix-docker-registry-issues


Copilot AI commented Apr 9, 2026

registry:3 has a known race condition with concurrent multi-platform Buildx pushes: blobs are uploaded but are not immediately resolvable by subsequent HEAD requests, which causes "blob unknown to registry" and "manifest unknown" errors in both the build-docker and build-docker-alpine jobs.
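One way to observe the symptom directly is to probe a pushed blob with the HEAD request the registry API exposes (HEAD /v2/<name>/blobs/<digest>). The step below is a hypothetical diagnostic sketch, not part of this PR: IMAGE_REPO and BLOB_DIGEST are placeholders to be filled in from a failing push log, and it assumes curl is available on the runner and that the registry service listens on localhost:5000 as in the workflow.

```yaml
# Hypothetical diagnostic step (not part of this PR).
- name: Probe blob visibility in the local registry
  env:
    IMAGE_REPO: example/image          # placeholder repository name
    BLOB_DIGEST: "sha256:REPLACE_ME"   # placeholder; copy a digest from the failing push log
  run: |
    for attempt in 1 2 3 4 5; do
      # -I issues a HEAD request; -w prints only the HTTP status code.
      status=$(curl -s -o /dev/null -I -w '%{http_code}' \
        "http://localhost:5000/v2/${IMAGE_REPO}/blobs/${BLOB_DIGEST}" || true)
      echo "attempt ${attempt}: HTTP ${status}"
      if [ "${status}" = "200" ]; then exit 0; fi
      sleep 1
    done
    echo "blob still not resolvable after 5 attempts" >&2
    exit 1
```

A 404 on early attempts that later turns into a 200 would be consistent with the race described above; a persistent 404 would point to a different cause.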

Changes

  • Added env block to the registry service in both build-docker and build-docker-alpine jobs:
services:
  registry:
    image: registry:3
    ports:
      - 5000:5000
    env:
      REGISTRY_STORAGE_DELETE_ENABLED: "true"
      REGISTRY_VALIDATION_DISABLED: "true"
      REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY: /var/lib/registry
      REGISTRY_STORAGE_REDIRECT_DISABLE: "true"
  • REGISTRY_VALIDATION_DISABLED: suppresses strict manifest validation that rejects blobs mid-push during concurrent platform writes
  • REGISTRY_STORAGE_REDIRECT_DISABLE: makes the registry serve blob content itself rather than redirecting clients to the storage backend, eliminating the redirect path where blobs appear missing immediately after upload
  • REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY: pins the storage root explicitly for consistent behavior
  • REGISTRY_STORAGE_DELETE_ENABLED: enables deletion of blobs and manifests by digest, allowing cleanup after partial/failed pushes
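
For reference, Distribution maps REGISTRY_-prefixed environment variables onto its configuration file, with each underscore standing for one level of nesting. Under that convention, the four variables above should correspond roughly to the config.yml fragment below. This is a sketch of the equivalent settings, not a file from this repository, and it omits other keys a complete configuration would need (such as version and http).

```yaml
# Sketch of the config.yml settings the four environment variables override.
# Assumes the standard REGISTRY_<SECTION>_<KEY> to nested-key mapping.
storage:
  delete:
    enabled: true                    # REGISTRY_STORAGE_DELETE_ENABLED
  filesystem:
    rootdirectory: /var/lib/registry # REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
  redirect:
    disable: true                    # REGISTRY_STORAGE_REDIRECT_DISABLE
validation:
  disabled: true                     # REGISTRY_VALIDATION_DISABLED
```

Setting the values through the service's env block, as this PR does, avoids having to mount a custom config file into the service container.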
Original prompt

Problem

The build-docker and build-docker-alpine jobs in .github/workflows/build.yml are failing because the local Docker registry service uses registry:3, which has known issues with concurrent multi-platform (multi-arch) pushes via Docker Buildx. The registry races on blob uploads: blobs are uploaded but cannot immediately be resolved by subsequent HEAD requests, resulting in "blob unknown to registry" and "manifest unknown" errors.

See PR #626 for context.

Required Changes

Apply both of the following fixes to the registry service containers in .github/workflows/build.yml:

Fix 1: Enable compatibility workarounds via environment variables

For both the build-docker job (around line 283-287) and the build-docker-alpine job (around line 347-351), add environment variables to the registry service:

services:
  registry:
    image: registry:3
    ports:
      - 5000:5000
    env:
      REGISTRY_STORAGE_DELETE_ENABLED: "true"
      REGISTRY_VALIDATION_DISABLED: "true"

Fix 2: Configure filesystem storage driver explicitly and disable redirect

For both the build-docker job and the build-docker-alpine job, also add these environment variables:

services:
  registry:
    image: registry:3
    ports:
      - 5000:5000
    env:
      REGISTRY_STORAGE_DELETE_ENABLED: "true"
      REGISTRY_VALIDATION_DISABLED: "true"
      REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY: /var/lib/registry
      REGISTRY_STORAGE_REDIRECT_DISABLE: "true"

So the final result for both build-docker and build-docker-alpine jobs should have the registry service configured as:

services:
  registry:
    image: registry:3
    ports:
      - 5000:5000
    env:
      REGISTRY_STORAGE_DELETE_ENABLED: "true"
      REGISTRY_VALIDATION_DISABLED: "true"
      REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY: /var/lib/registry
      REGISTRY_STORAGE_REDIRECT_DISABLE: "true"

Keep image: registry:3; do NOT downgrade to registry:2. The fix is to add the env block with all four environment variables to both registry service definitions.

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: Please find a solution for failing job 70645400141. Use the logs, job definition, and any referenced files where the failure occurred. Keep your response focused on the solution and include code suggestions when appropriate.

Assistant: Let me start by retrieving the job logs and the PR details to understand the failure.

Assistant: The logs show Docker registry interactions but the actual failure cause is truncated. Let me get more details from the workflow file and the PR itself.

Assistant:

Failing Job Analysis: build-docker (Job 70645400141)

The failure occurs in the build-docker job during the multi-platform Docker image build and push to the local registry service (localhost:5000). The logs show a pattern of "blob unknown to registry" errors when pushing image layers, followed by eventual "manifest unknown" errors when attempting to push/verify OCI image manifests.

Root Cause

The local Docker registry service is defined on line 285 as:

services:
  registry:
    image: registry:3
    ports:
      - 5000:5000

The workflow uses the registry:3 image, which is Distribution v3 (the CNCF Distribution project, formerly Docker Distribution). This version has known issues with concurrent multi-platform (multi-arch) pushes via Docker Buildx. When Buildx pushes multiple platform manifests simultaneously, the registry races on blob uploads: blobs are uploaded but cannot immediately be resolved by subsequent HEAD requests, resulting in "blob unknown to registry" and "manifest unknown" errors.

Solution

Option 1 (Recommended): Pin to registry v2, which is stable for this use case:

services:
  registry:
    image: registry:2
    ports:
      - 5000:5000

This applies to both the build-docker job (line 285) and the build-docker-alpine job (line 349).

Option 2: If registry:3 is required, enable the compatibility workaround by setting environment variables on the registry container:

services:
  registry:
    image: registry:3
    ports:
      - 5000:5000
    env:
      REGISTRY_STORAGE_DELETE_ENABLED: "true"
      REGISTRY_VALIDATION_DISABLED: "true"

Option 3: Use registry:3 with the filesystem storage driver explicitly configured and disable redirect:

services:
  registry:
    image: registry:3
    ports:
   ...

*This pull request was created from Copilot chat.*

Copilot AI changed the title from "[WIP] Fix Docker registry service configuration for multi-platform pushes" to "Fix concurrent multi-platform push failures in local registry (registry:3)" on Apr 9, 2026
Copilot AI requested a review from vharseko on April 9, 2026 19:07
vharseko closed this on Apr 10, 2026
vharseko deleted the copilot/fix-docker-registry-issues branch on April 11, 2026 08:10