| title | zmqruntime: Distributed Execution Framework with Dual-Channel ZMQ Transport | |||||||
|---|---|---|---|---|---|---|---|---|
| tags |
|
|||||||
| authors |
|
|||||||
| affiliations |
|
|||||||
| date | 15 January 2026 | |||||||
| bibliography | paper.bib |
zmqruntime provides a distributed execution framework for Python using ZMQ (ZeroMQ) as the transport layer. The key innovation is the dual-channel pattern: a control channel for commands (execute, status, cancel) and a data channel for results and streaming output. This separation enables:
- Non-blocking execution: Submit a job, get an execution ID, poll for results asynchronously
- Streaming results: Large arrays stream directly to disk/memory without buffering in memory
- Graceful cancellation: Cancel running jobs without killing the server
- Auto-spawning servers: If no server is running, the client spawns one automatically
from zmqruntime import ExecutionClient
client = ExecutionClient(port=5555)
exec_id = client.execute(pipeline_code, config_code) # Non-blocking
status = client.get_status(exec_id) # Poll for results
results = client.get_results(exec_id) # Retrieve when readyScientific pipelines often require distributed execution: process thousands of images across multiple machines, stream results to visualization tools, and handle failures gracefully. Traditional approaches either use heavyweight frameworks (Celery, Dask) or hand-written socket code.
zmqruntime provides a lightweight middle ground: ZMQ's simplicity with structured message types, dual-channel transport for non-blocking execution, and automatic server spawning for development convenience.
Dual-Channel Transport: Control channel (REQ/REP) for commands; data channel (PUSH/PULL) for results. This prevents head-of-line blocking where a slow result transfer blocks incoming commands.
# Control channel: fast, synchronous
client.execute(pipeline_code, config) # Returns execution_id immediately
# Data channel: asynchronous, streaming
results = client.get_results(execution_id) # Retrieves streamed resultsMessage Type Dispatch: Enum-based message types with automatic handler dispatch. Adding a new command requires only defining the enum and implementing the handler method:
class ControlMessageType(Enum):
EXECUTE = "execute"
STATUS = "status"
CANCEL = "cancel"
def dispatch(self, server, message):
return getattr(server, self.get_handler_method())(message)Execution Tracking: Each execution gets a unique ID. Clients poll for status without blocking. Results are streamed to disk or memory as they arrive.
Auto-Spawning Servers: If no server is running on the specified port, the client automatically spawns one. This enables development workflows where users don't need to manually start servers.
zmqruntime powers distributed execution in OpenHCS, enabling high-throughput screening where thousands of images are processed across multiple worker nodes. The dual-channel design allows the orchestrator to submit new jobs while previous jobs are still streaming results.