feat(coglet): IPC bridge enhancements for file outputs, large payloads, and upload URL #2746
Open
tempusfrangit wants to merge 13 commits into main from
Conversation
This commit truncates log lines at 4MiB and appends a truncation notice to the line. This protects against the panic/slot poisoning that can occur when a frame exceeds the codec's configured 8MiB size. Any log line over 1MiB borders on useless, but to ensure that even extremely long log lines are kept without disruption, the limit is set at 4MiB.
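A minimal sketch of that truncation, assuming a standalone helper; the constant and function names here are illustrative, not the actual coglet API:

```rust
// Illustrative only: the real implementation lives in the bridge code.
const MAX_LOG_LINE_BYTES: usize = 4 * 1024 * 1024; // 4 MiB, well under the 8 MiB codec frame

fn truncate_log_line(line: &str) -> String {
    if line.len() <= MAX_LOG_LINE_BYTES {
        return line.to_string();
    }
    // Back off to a char boundary so a UTF-8 sequence is never split.
    let mut cut = MAX_LOG_LINE_BYTES;
    while !line.is_char_boundary(cut) {
        cut -= 1;
    }
    format!("{} [log line truncated at 4MiB]", &line[..cut])
}
```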
Outputs > 6MiB are spilled to disk and sent as FileOutput path references over IPC instead of inline, avoiding the 8MiB LengthDelimitedCodec frame limit. Coglet creates per-prediction output dirs at /tmp/coglet/outputs/{prediction_id}/ and passes the path to the worker via SlotRequest::Predict.
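Roughly, the spill decision looks like the sketch below; the threshold matches the 6MiB limit described above, but the type and function names are placeholders rather than coglet's real ones:

```rust
use std::path::{Path, PathBuf};

const SPILL_THRESHOLD_BYTES: usize = 6 * 1024 * 1024; // keep frames well under the 8 MiB codec limit

// Placeholder for what actually goes over the IPC bridge.
enum OutputMessage {
    Inline(serde_json::Value),
    Spilled(PathBuf), // FileOutput-style path reference
}

fn build_output_message(
    output: &serde_json::Value,
    output_dir: &Path, // per-prediction dir, e.g. /tmp/coglet/outputs/{prediction_id}/
) -> Result<OutputMessage, Box<dyn std::error::Error>> {
    let bytes = serde_json::to_vec(output)?;
    if bytes.len() <= SPILL_THRESHOLD_BYTES {
        return Ok(OutputMessage::Inline(output.clone()));
    }
    // Too big to send inline: write the JSON to disk and send only the path.
    std::fs::create_dir_all(output_dir)?;
    let path = output_dir.join("output.json");
    std::fs::write(&path, &bytes)?;
    Ok(OutputMessage::Spilled(path))
}
```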
When the worker spills output to disk (FileOutput), the orchestrator reads it back and integrates it into the prediction via append_output. Handles both Oversized (JSON spill) and FileType (path reference) variants.
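For the Oversized case, the read-back amounts to something like this sketch (the helper name is hypothetical); the resulting value is what gets handed to append_output:

```rust
// Hypothetical helper: read a spilled JSON payload back into memory so the
// orchestrator can append it to the prediction's accumulated outputs.
fn read_oversized_spill(
    path: &std::path::Path,
) -> Result<serde_json::Value, Box<dyn std::error::Error>> {
    let bytes = std::fs::read(path)?;
    let value = serde_json::from_slice(&bytes)?;
    Ok(value)
}
```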
Generator outputs were collected into a Vec and bundled into a single Done frame, which could exceed the 8MiB IPC limit. Now each yield is sent immediately via slot_sender.send_output(), streaming through SlotResponse::Output frames. The Done frame carries an empty array for generators since outputs were already streamed. Plumbs slot_sender from PythonPredictHandler::predict through to process_generator_output and process_async_result for both predict and train paths.
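In outline, the streaming change looks like the sketch below; SlotSenderLike is a stand-in trait for the bridge's SlotSender, modeling only the two calls used here:

```rust
use serde_json::Value;

// Stand-in for the bridge's slot sender; only the calls used below are modeled.
trait SlotSenderLike {
    fn send_output(&self, value: Value) -> std::io::Result<()>;
    fn send_done(&self, outputs: Vec<Value>) -> std::io::Result<()>;
}

fn process_generator_output<I, S>(items: I, slot_sender: &S) -> std::io::Result<()>
where
    I: IntoIterator<Item = Value>,
    S: SlotSenderLike,
{
    for item in items {
        // Each yield goes out immediately as its own Output frame instead of
        // being collected into one oversized Done payload.
        slot_sender.send_output(item)?;
    }
    // Done carries an empty array; the outputs were already streamed.
    slot_sender.send_done(Vec::new())
}
```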
File-type outputs (os.PathLike, io.IOBase) are now detected in the worker and written to the per-prediction output dir instead of being base64-encoded in-process. This keeps network I/O out of the worker subprocess; the parent process handles uploads.
- Add SlotSender::write_file_output(bytes, ext) for language-agnostic file writing from any FFI worker (Python, Node, etc.)
- Add send_output_item() helper that routes by type: PathLike sends the existing path, IOBase reads bytes and writes via SlotSender, everything else goes through the existing JSON serialization path
- Update process_single_output to also detect and route file types
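Sketched as a free function (the real API is a SlotSender method, and the naming scheme below is invented), the worker-side file write is essentially:

```rust
use std::path::{Path, PathBuf};

// Write raw output bytes into the per-prediction output dir and return the
// path that will be sent over IPC as a FileOutput reference.
fn write_file_output(output_dir: &Path, bytes: &[u8], ext: &str) -> std::io::Result<PathBuf> {
    std::fs::create_dir_all(output_dir)?;
    // Unique-ish name per output; the real implementation may use a counter or UUID.
    let nanos = std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .unwrap_or_default()
        .as_nanos();
    let path = output_dir.join(format!("output-{nanos}.{ext}"));
    std::fs::write(&path, bytes)?;
    Ok(path)
}
```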
The orchestrator's FileOutput handler now distinguishes between FileOutputKind::Oversized (JSON spill, deserialize as before) and FileOutputKind::FileType (binary file, base64-encode as data URI). This is the fallback path when no upload_url is configured. File outputs from the worker are read from disk and encoded as data:{mime};base64,{data} URIs, using mime_guess for content type detection.
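The data URI fallback is roughly the following, assuming the mime_guess and base64 crates; this is the shape of the behavior, not the literal orchestrator code:

```rust
use std::path::Path;
use base64::Engine;

// Encode a worker-written file as a data URI, guessing the MIME type from the
// file extension (falling back to application/octet-stream).
fn file_to_data_uri(path: &Path) -> std::io::Result<String> {
    let bytes = std::fs::read(path)?;
    let mime = mime_guess::from_path(path).first_or_octet_stream();
    let data = base64::engine::general_purpose::STANDARD.encode(&bytes);
    Ok(format!("data:{mime};base64,{data}"))
}
```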
Add mime_type: Option<String> to SlotResponse::FileOutput so FFI workers can pass an explicit MIME type. When None, the orchestrator falls back to mime_guess from the file extension (matching old cog behavior). Plumbing is in place for future use — all current callers pass None.
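For orientation, the message shape implied by this change is roughly the following; field and variant names are assumptions based on the description above, not the exact protocol definition:

```rust
use std::path::PathBuf;

enum FileOutputKind {
    /// JSON payload spilled to disk because it exceeded the inline limit.
    Oversized,
    /// A real file output (Path/IOBase) written by the worker.
    FileType,
}

enum SlotResponse {
    FileOutput {
        path: PathBuf,
        kind: FileOutputKind,
        /// Explicit MIME type from the FFI worker; None means "guess from extension".
        mime_type: Option<String>,
    },
    // ...other variants elided
}
```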
Thread the --upload-url CLI arg through coglet.server.serve() to the orchestrator event loop. When set, file outputs are PUT to the signed endpoint per-yield as received (matching old Python cog behavior).
- upload_file() does a PUT with Content-Type, extracts the Location header, strips query params, and follows redirects via reqwest
- Uploads are spawned as tokio tasks so they don't block the event loop
- The Done handler awaits pending uploads before finalizing the prediction
- Failed/Cancelled/Error handlers abort pending uploads immediately
- Falls back to base64 data URI encoding when no upload_url is set
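A hedged sketch of the per-yield upload using reqwest (whose default client already follows redirects); the signature and error handling are approximations of the behavior described above, not the real upload_file():

```rust
use reqwest::header::{CONTENT_TYPE, LOCATION};

// PUT the bytes to the signed endpoint and return the output URL, preferring
// the Location header and stripping any query params (e.g. signing tokens).
async fn upload_file(
    client: &reqwest::Client,
    upload_url: &str,
    bytes: Vec<u8>,
    mime: &str,
) -> Result<String, Box<dyn std::error::Error + Send + Sync>> {
    let resp = client
        .put(upload_url)
        .header(CONTENT_TYPE, mime)
        .body(bytes)
        .send()
        .await?
        .error_for_status()?;

    let location = resp
        .headers()
        .get(LOCATION)
        .and_then(|v| v.to_str().ok())
        .unwrap_or(upload_url);

    let mut url = reqwest::Url::parse(location)?;
    url.set_query(None);
    Ok(url.to_string())
}
```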
- Root .venv is now created by the _setup_venv task (used by build:coglet, build:coglet:wheel, build:sdk, generate:stubs, stub:check)
- _.python.venv uses REPO_ROOT so all tasks share the same venv
- mise clean:python removes .venv, coglet-python/.venv, and *.so
- stub_gen.rs: write coglet/_impl.pyi (not __init__.pyi) so the native module stubs don't get overwritten by mypy stubgen
- module_variable! declarations for __version__, __build__, server so ty can resolve members imported from coglet._impl
- stub:check depends on generate:stubs to avoid duplicating logic
- #[allow(clippy::too_many_arguments)] on serve_impl (8 args after upload_url)
…verflow
Output is now always sent as a separate message before Done, with automatic spill-to-disk for values exceeding the 6MiB IPC frame limit. The orchestrator reconstructs the final output from accumulated per-yield values (generators, file uploads) rather than from the Done message payload. This fixes three issues:
- Large single outputs (>8MiB) caused "frame size too big" panics that poisoned slots, since the entire output was embedded in the Done message
- Generator/iterator outputs were lost because per-yield values accumulated in outputs() were ignored by the Done handler in favor of the empty Stream(vec![]) from the worker
- File upload outputs (Path, IOBase) yielded from generators were silently dropped for the same reason
Also adds a --upload-url flag to `cog serve`, a mock upload server in the integration test harness, and three new integration tests:
- coglet_large_output: 9MiB string output (would poison slot without spill)
- coglet_iterator_upload_url: generator Path yields uploaded to --upload-url
- coglet_iterator_path_output: generator Path yields as base64 data URIs
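The Done-handler change can be pictured like this; Prediction is a stand-in type, and the preference for accumulated per-yield values over the Done payload is the point being illustrated:

```rust
use serde_json::Value;

// Stand-in for the orchestrator's per-prediction state.
struct Prediction {
    outputs: Vec<Value>, // per-yield values accumulated during the prediction
}

impl Prediction {
    /// Drain the values accumulated so far (spilled outputs, uploads, yields).
    fn take_outputs(&mut self) -> Vec<Value> {
        std::mem::take(&mut self.outputs)
    }

    /// On Done, prefer the accumulated per-yield values; the worker's Done
    /// payload is empty for generators because everything was streamed.
    fn final_output(&mut self, done_payload: Vec<Value>) -> Vec<Value> {
        let accumulated = self.take_outputs();
        if accumulated.is_empty() {
            done_payload
        } else {
            accumulated
        }
    }
}
```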
9e39dc4 to c394d6d
…tting cog predict writes file outputs to disk, not as base64 JSON to stdout. Fix assertion to check stderr for "Written output to" messages.
c394d6d to 1390b59
Summary
This branch addresses several interrelated issues in the coglet IPC bridge that caused file-type prediction outputs to be silently lost, large outputs to poison worker slots, and upload URLs to be ignored.
Key changes
- File output routing through IPC: Generator yields of cog.Path and io.IOBase are now detected in the worker, written to disk, and sent as FileOutput messages over the bridge. The orchestrator reads them back, either uploading to a signed endpoint (when --upload-url is set) or base64-encoding as data URIs.
- Large output spilling: Prediction outputs exceeding the 6 MiB IPC limit (to ensure we stay under the 8 MiB frame limit) are automatically spilled to disk and sent as FileOutput(Oversized) references. The orchestrator reconstructs the original JSON value transparently. This prevents the "frame size too big" panic that previously poisoned slots.
- Output decoupled from Done: The worker now sends output as a separate SlotResponse::Output message before Done, rather than embedding it in the Done payload. The orchestrator builds the final output from accumulated per-yield values. This fixes generators returning empty output arrays and removes the frame size constraint from Done messages entirely.
- Upload URL wiring: --upload-url is accepted by cog serve, threaded through the Python entrypoint to coglet's orchestrator, and used for async PUT uploads of file outputs. Each upload runs in a spawned task; the Done handler awaits all pending uploads before marking the prediction as succeeded (a sketch of this upload barrier follows this list).
- Log size enforcement: Worker logs are truncated at 4 MiB to stay under the 8 MiB bridge frame limit.
- Stub generation and venv setup: Fixed mise tasks for Python type stub generation, venv creation, and editable coglet installs.
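As referenced under "Upload URL wiring", the upload barrier can be sketched with a tokio JoinSet; the real orchestrator may track its spawned tasks differently, and the names below are illustrative:

```rust
use tokio::task::JoinSet;

// Tracks in-flight upload tasks for one prediction.
struct PendingUploads {
    tasks: JoinSet<Result<String, String>>, // each task resolves to an output URL
}

impl PendingUploads {
    fn new() -> Self {
        Self { tasks: JoinSet::new() }
    }

    /// Spawn an upload so it doesn't block the event loop.
    fn spawn<F>(&mut self, fut: F)
    where
        F: std::future::Future<Output = Result<String, String>> + Send + 'static,
    {
        self.tasks.spawn(fut);
    }

    /// Done handler: wait for every pending upload before finalizing.
    async fn await_all(&mut self) -> Result<Vec<String>, String> {
        let mut urls = Vec::new();
        while let Some(joined) = self.tasks.join_next().await {
            urls.push(joined.map_err(|e| e.to_string())??);
        }
        Ok(urls)
    }

    /// Failed/Cancelled/Error handlers: drop in-flight uploads immediately.
    fn abort_all(&mut self) {
        self.tasks.abort_all();
    }
}
```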
Integration tests
Three new integration tests verify the end-to-end behavior:
- coglet_large_output: 9 MiB string output spilled to disk and reconstructed
- coglet_iterator_upload_url: Path objects uploaded per-yield to --upload-url
- coglet_iterator_path_output: Path objects returned as base64 data URIs

Files changed (highlights)
- bridge/protocol.rs: FileOutput variant with Oversized/FileType kinds, optional mime_type
- worker.rs: SlotSender file output methods, build_output_message spill logic, output sent separately from Done
- orchestrator.rs: FileOutput handling (upload or base64), upload barrier for Done, accumulated output reconstruction
- prediction.rs: take_outputs() for draining accumulated per-yield values
- predictor.rs: send_output_item/send_file_output for Path and IOBase detection in generators
- serve.go: --upload-url flag with host.docker.internal networking
- command.go, docker.go: ExtraHosts support in RunOptions
- harness.go: upload-server-start/upload-server-count commands

Test plan
- mise run test:rust (109 tests pass)
- mise run test:integration coglet_large_output (9 MiB output spills and reconstructs correctly)
- mise run test:integration coglet_iterator_upload_url (per-yield uploads to mock server)
- mise run lint:rust (clean)
- golangci-lint v2.8.0 run ./... (clean)