Document zero-copy semantics #2

Open
griffinmilsap wants to merge 5 commits into low-level-api from zero-copy-semantics

@griffinmilsap
Adds descriptions of backend transport and notes the zero-copy behavior of messaging (see ezmsg-org/ezmsg#209), highlighting the importance of treating incoming messages as immutable.

@KonradPilch left a comment

Great work Griffin!

transport-messaging-internals
axisarray

.. important:: `ezmsg` delivers subscriber messages with zero-copy semantics in all cases. Treat incoming messages as immutable, and copy data before mutating or republishing. See :doc:`transport-messaging-internals` for details and examples.

@KonradPilch Feb 17, 2026

I'm not sure this is the right place for this (important) note. In the near future, I plan to expand the high-level design documentation and move the note there. I also expect to refactor that document so it has a separate high-level API explanation, mirroring the low-level API document.
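
To make the immutability advice in the note concrete, here is a minimal sketch of why mutating a zero-copy message is hazardous. It does not use the `ezmsg` API: `deliver`, `scale_in_place`, `scale_copy`, and `observe` are hypothetical helpers, and a plain dict stands in for a message. The only assumption is that every subscriber is handed the same object.

```python
import copy


def deliver(message, subscribers):
    """Hand the SAME object to every subscriber (a stand-in for zero-copy delivery)."""
    return [subscriber(message) for subscriber in subscribers]


def scale_in_place(msg):
    # Unsafe: mutates the shared object, so later subscribers see the change.
    for i in range(len(msg["data"])):
        msg["data"][i] *= 2
    return msg


def scale_copy(msg):
    # Safe: copy first, then modify only the private copy.
    out = copy.deepcopy(msg)
    out["data"] = [x * 2 for x in out["data"]]
    return out


def observe(msg):
    # A second subscriber that only reads the data.
    return list(msg["data"])


original = {"data": [1, 2, 3]}
safe_results = deliver(original, [scale_copy, observe])

shared = {"data": [1, 2, 3]}
unsafe_results = deliver(shared, [scale_in_place, observe])

print(safe_results[1])    # what the reader saw after the copying subscriber ran
print(unsafe_results[1])  # what the reader saw after the in-place subscriber ran
```

In the safe run the reading subscriber still sees the original values. In the unsafe run it sees the doubled values, because both subscribers hold references to one object.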

GraphServer, Publisher, Subscriber, Channel
===========================================

- **GraphServer**: a lightweight TCP service that tracks the topic DAG, keeps a registry of publishers/subscribers, and notifies subscribers when their upstream publishers change. It also brokers shared-memory segment creation and attachment.

@KonradPilch Feb 17, 2026

A few of the terms here like TCP and DAG might need explanation.

My solution: link to a glossary for terms like this (otherwise it would only clutter the text). No need to do this here - for a future PR.

Copilot AI left a comment

Pull request overview

Adds new documentation explaining ezmsg transport internals and clarifies that subscriber delivery is always effectively zero-copy, emphasizing immutability requirements for received messages.

Changes:

  • Adds a new “Transport and Messaging Internals” explainer page describing runtime components, transport selection, and backpressure.
  • Updates pipeline Unit guidance and overview/explainer pages to highlight always-zero-copy subscriber semantics and link to the new explainer.
  • Tweaks low-level API explainer to add context links and clarify performance expectations.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| docs/source/how-tos/pipeline/unit.rst | Replaces the old zero_copy bullet with an “always zero-copy” immutability warning and links to internals. |
| docs/source/explanations/transport-messaging-internals.rst | New detailed explainer for GraphServer/Publisher/Subscriber/Channel, transport paths, backpressure, and zero-copy implications. |
| docs/source/explanations/low-level-api.rst | Adds link to internals page and a performance clarification note. |
| docs/source/explanations/content-explanations.rst | Adds the new internals page to the toctree and an overview warning about zero-copy semantics. |

Comment on lines +28 to +32
`ezmsg` uses the fastest transport available per Publisher/Channel pair:

- **Local transport (same process)**: the Publisher pushes the object directly into the Channel (`put_local`), and the Channel stores it in the `MessageCache` without serialization. This is the lowest-overhead path.
- **Shared memory (different process, SHM OK)**: the Publisher serializes the object using `MessageMarshal` (pickle protocol 5 with buffer support), writes it into a ring of shared-memory buffers, and notifies the Channel with a `TX_SHM` message. The Channel reads from shared memory using the message ID and caches the deserialized object.
- **TCP (fallback or forced)**: if SHM is unavailable (attach failed, remote host) or `force_tcp=True`, the Publisher sends a `TX_TCP` payload (header + serialized buffers) directly over the channel socket. The Channel deserializes it and caches the result.
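
The three paths listed above can be read as a small decision rule. The sketch below is illustrative only and is not the `ezmsg` implementation: `Transport` and `choose_transport` are hypothetical names, and checking the same-process case before `force_tcp` is an assumption, since the quoted text does not say how the two interact.

```python
from enum import Enum


class Transport(Enum):
    LOCAL = "local"  # same process, no serialization
    SHM = "shm"      # different process, shared memory attached
    TCP = "tcp"      # fallback, or explicitly forced


def choose_transport(same_process: bool, shm_ok: bool, force_tcp: bool = False) -> Transport:
    """Pick a transport for one Publisher/Channel pair (hypothetical helper)."""
    # Assumption: the local path is checked first and is unaffected by force_tcp.
    if same_process:
        return Transport.LOCAL
    # Shared memory is used only when it attached successfully and TCP is not forced.
    if shm_ok and not force_tcp:
        return Transport.SHM
    # Everything else (attach failed, remote host, force_tcp=True) falls back to TCP.
    return Transport.TCP


print(choose_transport(same_process=True, shm_ok=True))
print(choose_transport(same_process=False, shm_ok=True))
print(choose_transport(same_process=False, shm_ok=True, force_tcp=True))
print(choose_transport(same_process=False, shm_ok=False))
```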

Copilot AI Feb 18, 2026

This section uses single-backtick interpreted text for protocol constants and code identifiers (e.g., put_local, MessageCache, MessageMarshal, TX_SHM, TX_TCP). In reStructuredText, inline literals should use double backticks so these render consistently as code and don't get treated as title references.
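
For readers unfamiliar with the "pickle protocol 5 with buffer support" mentioned in the shared-memory bullet under review, here is a minimal standard-library sketch of out-of-band buffers. It shows only the general `pickle` mechanism, not how `MessageMarshal` is implemented. The message layout and the `wire_buffers` copy (standing in for a write to shared memory or a socket) are assumptions made for illustration.

```python
import pickle

# A large payload plus a little metadata; the dict layout is hypothetical.
samples = bytearray(1024)
samples[:4] = b"\x01\x02\x03\x04"
message = {"topic": "EEG", "samples": pickle.PickleBuffer(samples)}

# With protocol 5, buffers handed to buffer_callback are kept out of the
# pickle stream, so the header stays small and the payload travels separately.
buffers = []
header = pickle.dumps(message, protocol=5, buffer_callback=buffers.append)

# Stand-in for the transport: copy each raw buffer as if writing it to
# shared memory or a socket.
wire_buffers = [bytes(buf.raw()) for buf in buffers]

# The receiver supplies the buffers, in order, when unpickling the header.
received = pickle.loads(header, buffers=wire_buffers)

print(len(header) < len(samples))
print(bytes(received["samples"]) == bytes(samples))
```

The design point is that the pickle stream carries only structure and metadata, while bulk data moves as raw buffers that a transport can place wherever it likes.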

KonradPilch and others added 4 commits February 17, 2026 19:18
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

3 participants