Add register_stream for cross-thread GPU stream usage #3398

Open
TheTom wants to merge 1 commit into ml-explore:main from TheTom:feature/register-stream

TheTom commented Apr 12, 2026

Summary

After #3348 made CommandEncoder thread-local, evaluating arrays on a thread that did not create the stream fails because the encoder does not exist on the new thread. ThreadLocalStream (#3355) solves this for Python by creating per-thread streams, but there is no way to share a single stream across threads.

register_stream(Stream) registers an existing stream's GPU command encoder on the calling thread, allowing eval() to succeed there. It is a no-op if the stream is already registered on that thread, and it is safe to call from any thread, including the creating thread.

This is needed for Swift concurrency, where the runtime may hop tasks between threads, and for Python threading.Thread workflows that share a stream.

Related: #3078

Changes

  • mlx/stream.h / mlx/stream.cpp: add register_stream(Stream)
  • python/src/stream.cpp: nanobind binding
  • python/tests/test_threads.py: cross-thread eval test, idempotency test

Test plan

  • python/tests/test_threads.py passes (3/3)
  • C++ library builds clean
  • CI

TheTom marked this pull request as ready for review April 12, 2026 03:38