Description
Two related improvements to docker compose publish when using --app:
1. Parallel + deduplicated image copy
Currently oci.Copy is called serially per service. On large stacks this is slow,
and services sharing the same image trigger duplicate copies.
Proposed: dedupe by normalized image reference, then copy concurrently with bounded
parallelism (default 4). Once all copies finish, sort the resulting descriptors so the
output is deterministic regardless of completion order.
2. Phase observability
Currently publish only emits "publishing" / "published" events. When it fails,
there's no visibility into which phase failed or how long each phase took.
Proposed: emit structured events per phase (prechecks, push-images, create-layers,
push-manifest, copy-images, push-index) with elapsed time and counts.
Does this approach make sense?