fix: treat container exit code 0 as success in --wait mode#13661
fix: treat container exit code 0 as success in --wait mode#13661
Conversation
When using , containers that exit with code 0 (such as init or oneshot containers) were incorrectly treated as failures, causing the command to return exit code 1. Introduce a typed containerExitError in isServiceHealthy and handle exit code 0 in the RunningOrHealthy wait path. Services with restart policies that restart on clean exit (always, unless-stopped) are excluded via restartsOnExit0 to avoid false positives. Fixes #10596 Signed-off-by: Guillaume Lours <glours@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes docker compose up --wait incorrectly returning exit code 1 when a (non-restarting) init/oneshot container exits successfully (exit code 0), by teaching the wait logic to treat clean exits as success and by introducing a typed error to reliably detect container exit status.
Changes:
- Introduces
containerExitErrorfromisServiceHealthyto carry the container exit code without relying on string parsing. - Updates the
running_or_healthydependency wait path to treat exit code 0 as a successful completion when the service won’t be restarted on clean exit. - Adds unit tests for restart-policy detection and for the new typed exit error behavior.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
pkg/compose/convergence.go |
Adds typed exit error, adds restart-on-exit0 detection, and adjusts --wait dependency handling for clean exits. |
pkg/compose/convergence_test.go |
Extends isServiceHealthy tests to assert typed exit errors and exit code propagation. |
pkg/compose/start_test.go |
Adds tests for restartsOnExit0 across service and deploy restart policy configurations. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| switch service.Restart { | ||
| case types.RestartPolicyAlways, types.RestartPolicyUnlessStopped: | ||
| return true | ||
| } | ||
| if service.Deploy != nil && service.Deploy.RestartPolicy != nil { | ||
| switch service.Deploy.RestartPolicy.Condition { | ||
| case "any", "always", "unless-stopped": | ||
| return true |
There was a problem hiding this comment.
restartsOnExit0 compares service.Restart directly to types.RestartPolicyAlways/UnlessStopped, but restart strings can legally include a suffix (e.g. "on-failure:3") and this codebase already parses restart policies via strings.Cut + mapRestartPolicyCondition (see getRestartPolicy in pkg/compose/create.go). As written, values like "always:1" or "any" would be misclassified, which can cause --wait to treat an exit(0) as success even though the service would be restarted. Consider parsing service.Restart the same way as getRestartPolicy and basing the decision on the mapped engine mode (Always/UnlessStopped).
| s.events.On(containerEvents(waitingFor, exited)...) | ||
| return nil |
There was a problem hiding this comment.
In the exit-code-0 success path, this treats the whole dependency as satisfied and emits "exited" events for all containers in waitingFor. If the dependency service is scaled >1 and only one replica exits(0) while others are still starting/unhealthy, waitDependencies would return success prematurely. Consider limiting this shortcut to single-container services, or re-checking every container in waitingFor and only succeeding when they have all cleanly exited(0) (and emitting events only for the containers that actually exited).
| s.events.On(containerEvents(waitingFor, exited)...) | |
| return nil | |
| // Only treat this as a satisfied dependency when exactly one | |
| // container is being waited on. For multi-container (scaled) | |
| // services, we must not mark all containers as exited or | |
| // consider the dependency satisfied based on a single replica. | |
| if len(waitingFor) == 1 { | |
| s.events.On(containerEvents(waitingFor, exited)...) | |
| return nil | |
| } | |
| // For multiple containers, fall through to the standard | |
| // required/optional handling below. |
| // A container that exited with code 0 can be treated as | ||
| // successfully completed (init/oneshot containers) when the | ||
| // service won't be restarted by Docker after a clean exit. | ||
| var exitErr *containerExitError | ||
| if errors.As(err, &exitErr) && exitErr.exitCode == 0 && !restartsOnExit0(project, dep) { | ||
| s.events.On(containerEvents(waitingFor, exited)...) | ||
| return nil | ||
| } |
There was a problem hiding this comment.
The new "exit code 0 treated as success" behavior in the ServiceConditionRunningOrHealthy wait path isn’t covered by tests. There’s currently only minimal coverage for waitDependencies (scale=0 and service_started); adding a unit test that simulates ContainerInspect returning Status=exited/ExitCode=0 and asserts waitDependencies returns nil (and does not fail the overall wait) would help prevent regressions.
|
note to myself: need to check we get this same fix in #13641 |
|
I'm not convinced this is the right approach. IMHO we miss an explicit definition for init containers in the compose spec, which we could rely on to distinguish vs long-running services, and act accordingly. If we had such a service-level attribute declared, we could automatically select |
What I did
When using , containers that exit with code 0 (such as init or oneshot containers) were incorrectly treated as failures, causing the command to return exit code 1.
Introduce a typed containerExitError in isServiceHealthy and handle exit code 0 in the RunningOrHealthy wait path.
Services with restart policies that restart on clean exit (always, unless-stopped) are excluded via restartsOnExit0 to avoid false positives.
Related issue
Fixes #10596
Crafted with Claude Code
(not mandatory) A picture of a cute animal, if possible in relation to what you did
