Compute worker issues #1205

@Didayolo

Hopefully solving several points: #2223

1. Containers not removed

  • 11/02/2026: submissions containers staying up forever
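
One possible direction for the cleanup, as a rough sketch only: wrap the job so the container is force-removed in a finally block, whether the job succeeded or crashed. The helper name run_with_cleanup and its parameters are hypothetical and not taken from compute_worker.py; it also assumes the worker drives the docker CLI.

```python
import subprocess
from typing import Callable, Sequence


def run_with_cleanup(
    container_name: str,
    job: Callable[[], None],
    run_cmd: Callable[[Sequence[str]], object] = subprocess.run,
) -> None:
    """Run `job`, then force-remove the container even if the job raised."""
    try:
        job()
    finally:
        # `docker rm -f` stops and removes the container if it still exists.
        # The exit code is not checked, so an already-removed container
        # does not turn the cleanup itself into a failure.
        run_cmd(["docker", "rm", "-f", container_name])
```

The run_cmd parameter is only there so the cleanup can be exercised without a Docker daemon.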

2. Wrong log when storage is full

When docker pull fails because the storage is full, the logs do not state the cause clearly.
See:

The submission then gets stuck in the Running state.
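
A minimal sketch of how the pull step could surface the cause, assuming the worker shells out to the docker CLI and that the "no space left on device" text from the OS error ends up in stderr. Both function names are hypothetical, and the string match is a heuristic, not an official Docker error code.

```python
import subprocess


def explain_pull_failure(stderr: str) -> str:
    """Map `docker pull` stderr to a readable cause (heuristic string match)."""
    if "no space left on device" in stderr.lower():
        return "storage full on the worker"
    return "unknown pull error"


def pull_image(image: str) -> None:
    """Pull an image; on failure, raise with the explained cause and raw stderr."""
    result = subprocess.run(
        ["docker", "pull", image], capture_output=True, text=True
    )
    if result.returncode != 0:
        cause = explain_pull_failure(result.stderr)
        raise RuntimeError(
            f"Pull for image {image} failed: {cause}\n{result.stderr}"
        )
```

Raising here, instead of only logging, also gives the caller a chance to mark the submission as Failed rather than leaving it in Running.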

3. Progress bar

Related: show_progress and the progress bar add to the mess:

2026-02-28 02:38:37.854 | ERROR    | compute_worker:show_progress:137 - There was an error showing the progress bar
2026-02-28 02:38:37.854 | ERROR    | compute_worker:show_progress:138 - 6
2026-02-28 02:38:37.955 | ERROR    | compute_worker:show_progress:137 - There was an error showing the progress bar
2026-02-28 02:38:37.955 | ERROR    | compute_worker:show_progress:138 - 1
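
The bare "6" and "1" above are not very helpful. One guess (unverified) is that they are the str() of an exception that carries only a key or index. A small hypothetical helper like the one below would at least put the exception type in the log line.

```python
def describe_exception(exc: BaseException) -> str:
    """Return the exception type together with its message for logging."""
    return f"{type(exc).__name__}: {exc}"


# Illustration with a made-up lookup failure:
try:
    {}[6]
except KeyError as exc:
    message = describe_exception(exc)  # "KeyError: 6" instead of just "6"
```

Logging the full traceback at the same place would be even better, if the logger in use supports it.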

4. Logs


5. No space left

How should we manage the disks? Should we limit the size of Docker images?
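
One option, sketched under assumptions: check free space before preparing a submission and refuse early with a clear message. The function name and the threshold are hypothetical; shutil.disk_usage is from the Python standard library.

```python
import shutil


def has_free_space(path: str, min_free_bytes: int) -> bool:
    """Return True if the filesystem holding `path` has enough free bytes."""
    usage = shutil.disk_usage(path)
    return usage.free >= min_free_bytes


# Example: require 5 GiB (arbitrary illustrative value) before pulling an image.
MIN_FREE = 5 * 1024**3
enough = has_free_space(".", MIN_FREE)
```

When the check fails, the worker could prune unused images (for example with docker image prune) or fail the submission with an explicit "disk full" message.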

6. Submissions not marked as Failed

Submissions stuck in "Running" or "Scoring" status

Related issues:

Example failure during "Preparing":

[2025-09-18 11:25:05,234: ERROR/ForkPoolWorker-2] Task compute_worker_run[fd956bf5-3e2d-4168-ab48-f0896dc80993] raised unexpected: OSError(28, 'No space left on device')
Traceback (most recent call last):
[...]
OSError: [Errno 28] No space left on device
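
A rough sketch of how an unexpected exception such as the OSError above could still end in a Failed status: wrap the task body so that any exception is reported before it propagates. The decorator fail_on_error and the report_status callback are hypothetical stand-ins for whatever the worker uses to update the submission.

```python
import functools
from typing import Callable


def fail_on_error(report_status: Callable[[str, str], None]):
    """Decorator: report "Failed" with the error text, then re-raise."""

    def decorator(task):
        @functools.wraps(task)
        def wrapper(*args, **kwargs):
            try:
                return task(*args, **kwargs)
            except Exception as exc:
                # Tell the platform first, so the submission does not stay
                # in "Preparing", "Running" or "Scoring" forever.
                report_status("Failed", f"{type(exc).__name__}: {exc}")
                raise

        return wrapper

    return decorator
```

Re-raising keeps the traceback in the worker logs, so the task runner still sees the failure.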

7. Duplication of submission files

8. To check

The log level is defined in this way in compute_worker.py:

configure_logging(
    os.environ.get("LOG_LEVEL", "INFO"), os.environ.get("SERIALIZED", "false")
)

Generally we want as much logging as possible, so we may want to default to the "DEBUG" log level.
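
If the default is changed, a small sketch like the following could resolve the level from LOG_LEVEL and fall back to "DEBUG" when the variable is unset or invalid. The helper resolve_log_level and the list of allowed names are assumptions; the names accepted by configure_logging may differ.

```python
import os
from typing import Mapping

# Assumed set of level names; adjust to whatever configure_logging accepts.
ALLOWED_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def resolve_log_level(
    env: Mapping[str, str] = os.environ, default: str = "DEBUG"
) -> str:
    """Read LOG_LEVEL from `env`, normalise it, and fall back to `default`."""
    level = env.get("LOG_LEVEL", default).strip().upper()
    return level if level in ALLOWED_LEVELS else default
```

The result could then be passed as the first argument of the configure_logging call shown above.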



8. Directory structure problem

9. Docker pull failing

  • Docker pull failing
Pull for image: codalab/codalab-legacy:py39 returned a non-zero exit code! Check if the docker image exists on docker hub.

Related issues:

Solution:

10. Logs at the wrong place

11. No hostname in server status when status is "Preparing"

https://www.codabench.org/server_status
