feat: add chunked file upload support Streaming Upload API (rx.upload_files_chunk) #6190

Open
FarhanAliRaza wants to merge 12 commits into reflex-dev:main from FarhanAliRaza:chunked-upload

@FarhanAliRaza
Collaborator

@FarhanAliRaza FarhanAliRaza commented Mar 18, 2026

Implement chunked/streaming file uploads to handle large files without loading them entirely into memory. Moves upload handling logic from app.py to event.py, adds chunked upload JS helpers, and updates the upload component to support the new upload_files_chunk API. Includes unit and integration tests for chunked upload, cancel, and streaming.

All Submissions:

  • Have you followed the guidelines stated in CONTRIBUTING.md file?
  • Have you checked to ensure there aren't any other open Pull Requests for the desired change?

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

New Feature Submission:

  • Does your submission pass the tests?
  • Have you linted your code locally prior to submission?

Changes To Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully run tests with your changes locally?

closes #6184

@codspeed-hq

codspeed-hq bot commented Mar 18, 2026

Merging this PR will improve performance by 3.48%

⚡ 1 improved benchmark
✅ 7 untouched benchmarks

Performance Changes

| Benchmark | BASE | HEAD | Efficiency |
| --- | --- | --- | --- |
| test_compile_stateful[_stateful_page] | 150.9 µs | 145.8 µs | +3.48% |

Comparing FarhanAliRaza:chunked-upload (c4e6690) with main (7ee3026)

@greptile-apps
Contributor

greptile-apps bot commented Mar 18, 2026

Greptile Summary

This PR implements chunked/streaming file uploads via a new rx.upload_files_chunk API. It refactors all upload handling out of app.py into a new reflex/_upload.py module and introduces three public types (UploadChunk, UploadChunkIterator, UploadFilesChunk) that enable large-file uploads without buffering the entire body in memory.

How it fits in: Both the existing buffered path (rx.upload_files → list[UploadFile]) and the new streaming path (rx.upload_files_chunk → UploadChunkIterator) share the same /upload HTTP endpoint and the same "uploadFiles" JS client handler. The server auto-dispatches based on whether the matched event handler is a background task with an UploadChunkIterator annotation. A custom streaming multipart parser (_UploadChunkMultipartParser) reads the request body incrementally and pushes UploadChunk objects into a back-pressured async queue consumed by a @rx.event(background=True) handler, so state updates flow to the browser via the existing WebSocket channel throughout the upload.
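The back-pressured queue described above can be sketched with plain asyncio. This is a minimal illustration, not the code from reflex/_upload.py: the names Chunk and BoundedChunkIterator are hypothetical stand-ins, and only the maxsize=8 default is taken from the sequence diagram in this review.

```python
import asyncio
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in for an upload chunk."""

    filename: str
    offset: int
    data: bytes


class BoundedChunkIterator:
    """Async iterator with backpressure: push() waits while the buffer is full."""

    def __init__(self, maxsize: int = 8) -> None:
        self._maxsize = maxsize
        self._buffer = deque()
        self._finished = False
        self._condition = asyncio.Condition()

    async def push(self, chunk: Chunk) -> None:
        """Producer side: block until there is room, then enqueue the chunk."""
        async with self._condition:
            await self._condition.wait_for(lambda: len(self._buffer) < self._maxsize)
            self._buffer.append(chunk)
            self._condition.notify_all()

    async def finish(self) -> None:
        """Producer side: signal that no more chunks will arrive."""
        async with self._condition:
            self._finished = True
            self._condition.notify_all()

    def __aiter__(self):
        return self

    async def __anext__(self) -> Chunk:
        """Consumer side: wait for a chunk, or stop once drained and finished."""
        async with self._condition:
            await self._condition.wait_for(lambda: bool(self._buffer) or self._finished)
            if self._buffer:
                chunk = self._buffer.popleft()
                self._condition.notify_all()  # wake a producer waiting for room
                return chunk
            raise StopAsyncIteration


async def demo() -> list:
    # Created inside the running loop so the Condition binds to it.
    chunks = BoundedChunkIterator(maxsize=2)

    async def producer() -> None:
        for i in range(5):
            await chunks.push(Chunk("a.bin", i * 4, b"data"))
        await chunks.finish()

    task = asyncio.create_task(producer())
    offsets = [chunk.offset async for chunk in chunks]
    await task
    return offsets
```

With maxsize=2 the producer can never run more than two chunks ahead of the consumer, which is the property that keeps a large upload from accumulating in memory.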

Key changes:

  • New reflex/_upload.py with UploadChunk, UploadChunkIterator, streaming multipart parser, and both buffered/chunked route handlers
  • reflex/event.py gains resolve_upload_handler_param, resolve_upload_chunk_handler_param, and UploadFilesChunk — validates handler annotations at event-spec creation time
  • reflex/components/core/upload.py widens on_drop to accept either list[UploadFile] or UploadChunkIterator
  • JS uploadFiles now treats an empty file list as a no-op rather than routing empty uploads back through the WebSocket
  • pyi_generator.py extended to resolve locally-visible types (e.g. UploadChunkIterator) correctly in generated .pyi stubs
  • Integration tests cover full streaming, cancellation, and partial-write verification
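The annotation check mentioned for reflex/event.py can be illustrated with a small sketch. It assumes the validation amounts to finding the one handler parameter whose type hint matches the expected upload type. The classes UploadFile, ChunkIterator and UploadTypeError below are local stand-ins, and resolve_param is a hypothetical helper, not the actual resolve_upload_handler_param.

```python
from typing import Callable, get_type_hints


class UploadFile:
    """Stand-in for the buffered upload file type."""


class ChunkIterator:
    """Stand-in for the streaming chunk iterator type."""


class UploadTypeError(TypeError):
    """Stand-in error raised when a handler has the wrong annotation."""


def resolve_param(handler: Callable, expected: object) -> str:
    """Return the name of the single parameter annotated with `expected`."""
    hints = get_type_hints(handler)
    hints.pop("return", None)
    matches = [name for name, hint in hints.items() if hint == expected]
    if len(matches) != 1:
        raise UploadTypeError(
            f"{handler.__name__} needs exactly one parameter annotated as {expected!r}"
        )
    return matches[0]


async def save_all(self, files: list[UploadFile]):
    """Buffered-style handler."""


async def save_stream(self, chunk_iter: ChunkIterator):
    """Streaming-style handler."""
```

Calling resolve_param(save_all, list[UploadFile]) yields the parameter name to bind, while resolve_param(save_all, ChunkIterator) raises immediately, so a mismatch surfaces when the event spec is built rather than during a request.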

Confidence Score: 4/5

  • Safe to merge with one targeted fix and two minor cleanups remaining.
  • The core streaming upload implementation is solid: the UploadChunkIterator uses a proper async condition variable with backpressure, the multipart parser correctly tracks per-file byte offsets, the background task lifecycle is cleanly managed, and both buffered and streaming paths share the same endpoint without breaking existing behaviour. Integration tests cover the happy path and cancellation. The main actionable finding is that FileUpload.as_event_spec silently swallows UploadTypeError, which means misuse of upload_files() on a background handler produces a confusing 500 at runtime rather than a build-time error. The duplicate "uploadFiles" constant across event.py and _upload.py is a minor hygiene issue. Neither concern blocks the feature from working correctly in the intended usage.
  • reflex/event.py (UploadTypeError swallowing in FileUpload.as_event_spec) and reflex/_upload.py (duplicate constant)

Important Files Changed

| Filename | Overview |
| --- | --- |
| reflex/_upload.py | New module consolidating all upload handling: introduces UploadChunk, UploadChunkIterator (async producer/consumer with backpressure), a streaming multipart parser, and route-level dispatch between buffered and chunked paths. Duplicate constant _UPLOAD_FILES_CLIENT_HANDLER should be imported from event.py instead. |
| reflex/event.py | Adds resolve_upload_handler_param / resolve_upload_chunk_handler_param helpers, UploadFilesChunk class, and the UPLOAD_FILES_CLIENT_HANDLER constant. FileUpload.as_event_spec silently swallows UploadTypeError, which lets background-handler mismatches reach the server as a 500 instead of failing fast. |
| reflex/components/core/upload.py | Adds _on_drop_args_spec tuple to allow on_drop to accept either list[UploadFile] or UploadChunkIterator, and guards call_event_handler with a client_handler_name check. Introduces _UPLOAD_FILES_CLIENT_HANDLER locally (duplicate of constant in event.py). |
| reflex/.templates/web/utils/state.js | Removes the empty-file early-path that previously routed zero-file uploads over the websocket. Both regular and chunked uploads now always go through the HTTP upload endpoint regardless of file count. |
| tests/integration/test_upload.py | Adds streaming upload handler (handle_upload_stream) and two new integration tests (test_upload_chunk_file, test_cancel_upload_chunk). Magic sleep values (1 s, 12 s) in the cancel test should have comments explaining the time rationale. |

Sequence Diagram

sequenceDiagram
    participant Browser
    participant upload.js
    participant state.js
    participant UploadEndpoint as /upload (server)
    participant _upload.py
    participant BackgroundTask
    participant UploadChunkIterator
    participant StateProxy

    Browser->>state.js: user clicks Upload (uploadFilesChunk event)
    state.js->>state.js: applyRestEvent (handler === "uploadFiles")
    state.js->>upload.js: uploadFiles(handler, files, upload_id, ...)
    upload.js->>UploadEndpoint: POST /upload (multipart, Reflex-Event-Handler header)

    UploadEndpoint->>_upload.py: upload_file(request)
    _upload.py->>_upload.py: _require_upload_headers(request)
    _upload.py->>_upload.py: _get_upload_runtime_handler(app, token, handler)
    _upload.py->>_upload.py: handler.is_background → resolve_upload_chunk_handler_param

    _upload.py->>UploadChunkIterator: UploadChunkIterator(maxsize=8)
    _upload.py->>BackgroundTask: app._process_background(state, event{chunk_iter})
    BackgroundTask-->>_upload.py: asyncio.Task
    _upload.py->>UploadChunkIterator: set_consumer_task(task)

    activate BackgroundTask
    BackgroundTask->>UploadChunkIterator: async for chunk in chunk_iter (blocks waiting)

    _upload.py->>_upload.py: _UploadChunkMultipartParser.parse()
    loop Each network chunk from stream
        _upload.py->>UploadChunkIterator: push(UploadChunk)
        UploadChunkIterator->>BackgroundTask: yields chunk (wakes consumer)
        BackgroundTask->>StateProxy: async with self (update progress state)
        StateProxy-->>Browser: state delta via WebSocket
    end

    _upload.py->>UploadChunkIterator: finish()
    BackgroundTask->>BackgroundTask: handler returns (files written)
    BackgroundTask->>StateProxy: async with self (set completed_files)
    StateProxy-->>Browser: final state delta via WebSocket
    deactivate BackgroundTask

    _upload.py-->>upload.js: 202 + StateUpdate(final=True) ndjson
    upload.js-->>Browser: upload complete

Reviews (2): Last reviewed commit: "refactor: move UploadChunk exports from ..."
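The consumer side of the diagram, where the background handler writes files as chunks arrive, might look roughly like the following. This is a sketch under assumptions: the Chunk fields (filename, offset, data) are guessed from the summary's mention of per-file byte offsets, write_chunks is a hypothetical helper, and the state updates that the real handler performs inside `async with self` are left out.

```python
import asyncio
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import AsyncIterator


@dataclass(frozen=True)
class Chunk:
    """Hypothetical chunk shape: which file, where in the file, and the bytes."""

    filename: str
    offset: int
    data: bytes


async def write_chunks(chunks: AsyncIterator[Chunk], dest: Path) -> dict:
    """Write each chunk at its offset and return bytes written per file."""
    written = {}
    handles = {}
    try:
        async for chunk in chunks:
            name = Path(chunk.filename).name  # drop client-supplied directories
            if name not in handles:
                handles[name] = (dest / name).open("wb")
            handle = handles[name]
            handle.seek(chunk.offset)
            handle.write(chunk.data)  # blocking write, acceptable for a sketch
            written[name] = written.get(name, 0) + len(chunk.data)
    finally:
        for handle in handles.values():
            handle.close()
    return written


async def fake_stream():
    """Interleaved chunks for two files, as a multipart stream might deliver."""
    yield Chunk("a.txt", 0, b"hello ")
    yield Chunk("b.txt", 0, b"xyz")
    yield Chunk("a.txt", 6, b"world")


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        dest = Path(tmp)
        written = asyncio.run(write_chunks(fake_stream(), dest))
        return written, (dest / "a.txt").read_bytes()
```

Because nothing is held beyond the current chunk, a cancelled upload leaves a partially written file on disk, which is the situation the partial-write verification in the integration tests is concerned with.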

Collaborator

@masenf masenf left a comment

if possible, the frontend code should be consolidated. i don't think there's a need to change the frontend code at all. you should be able to detect which type of upload is used in the backend and dispatch to the correct upload type based on the resolved handler arg type
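A rough sketch of that backend-side dispatch follows, assuming the server can see the resolved handler function and whether it is a background task. Handler, ChunkIterator and pick_upload_path are hypothetical names, and error handling for mismatched handlers is not shown.

```python
from dataclasses import dataclass
from typing import Callable, get_type_hints


class ChunkIterator:
    """Stand-in for the streaming chunk iterator type."""


@dataclass
class Handler:
    """Hypothetical view of a resolved event handler."""

    fn: Callable
    is_background: bool = False


def pick_upload_path(handler: Handler) -> str:
    """Choose the upload path from the handler alone, with no client-side flag."""
    hints = get_type_hints(handler.fn)
    hints.pop("return", None)
    wants_stream = any(hint is ChunkIterator for hint in hints.values())
    return "chunked" if handler.is_background and wants_stream else "buffered"


async def save_stream(self, chunk_iter: ChunkIterator):
    """Streaming-style handler."""


async def save_all(self, files: list):
    """Buffered-style handler."""
```

Under this scheme the JS client posts to the same endpoint for both kinds of upload, and only the server needs to know which handler type it resolved.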

Move upload helpers from reflex/upload.py to reflex/_upload.py, unify
the frontend to use a single uploadFiles function instead of separate
uploadFiles/uploadFilesChunk paths, and normalize upload payload keys
server-side in state.py instead of branching in the JS client.
@FarhanAliRaza
Collaborator Author

Don't know why pre-commit is failing.

@masenf
Collaborator

masenf commented Mar 18, 2026

Don't know why pre-commit is failing.

The pyi_generator script created files that differ from the last known hash, i.e. "something" in the pyi output changed, and it looks like it was something that most components inherit, so probably in the base component class

EDIT: actually i see that you added UploadFile as default import, so every pyi file got its hash changed. do we need UploadFile as a default import?
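To illustrate why one extra default import can invalidate every stub, here is a minimal sketch of a content-hash check. The hashing scheme (SHA-256 over the file text), the helper names and the sample stub contents are all assumptions for illustration; the actual pre-commit check in the repository may work differently.

```python
import hashlib


def content_hash(text: str) -> str:
    """Hash the full text of a generated stub."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_stubs(generated: dict, known_hashes: dict) -> list:
    """Return the stub paths whose content no longer matches the recorded hash."""
    return sorted(
        path
        for path, text in generated.items()
        if known_hashes.get(path) != content_hash(text)
    )


old_stubs = {
    "button.pyi": "from typing import Any\n\nclass Button: ...\n",
    "upload.pyi": "from typing import Any\n\nclass Upload: ...\n",
}
known = {path: content_hash(text) for path, text in old_stubs.items()}

# Simulate a new default import being prepended to every generated stub.
new_stubs = {
    path: "from demo import UploadFile\n" + text for path, text in old_stubs.items()
}
```

Here changed_stubs(new_stubs, known) reports every file, even though no component changed, because the shared import line is part of each hashed file.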

@FarhanAliRaza
Collaborator Author

EDIT: actually i see that you added UploadFile as default import, so every pyi file got its hash changed. do we need UploadFile as a default import?

No, we don't need it in the default imports.
But there is another issue: it was generating verbose pyi output like this:
[image]

Even when upload.pyi has a top-level import of UploadFile, it still tries to do an absolute import, which can be made better.
I fixed this in pyi_generator.

@FarhanAliRaza FarhanAliRaza requested a review from masenf March 19, 2026 09:03
…hrough upload endpoint

Move UploadChunk and UploadChunkIterator from reflex.event to reflex._upload,
use lazy imports to break circular dependencies, and remove early-return guards
for empty file lists. Empty uploads now flow through the normal upload path
instead of being short-circuited on the frontend or normalized via websocket
fallback (_normalize_upload_payload removed). Adds tests for empty buffered
and chunk uploads with aliased handler parameters.
Re-export UploadChunk and UploadChunkIterator directly from
reflex._upload instead of re-importing them through reflex.event,
removing the eager import_module call at module load time.
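One common way to defer an import until a name is actually used is a module-level `__getattr__` (PEP 562). The sketch below shows that technique in general and is not taken from the PR: make_lazy_module is a hypothetical helper that builds a module object in code so the example is self-contained, and collections.OrderedDict stands in for a lazily re-exported class.

```python
import importlib
import types


def make_lazy_module(name: str, lazy_attrs: dict) -> types.ModuleType:
    """Build a module whose listed attributes are imported on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr: str):
        try:
            target_module, target_attr = lazy_attrs[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(target_module), target_attr)
        setattr(module, attr, value)  # cache so later lookups skip __getattr__
        return value

    module.__getattr__ = __getattr__  # PEP 562 hook, looked up in the module dict
    return module


# Stand-in for re-exporting a name without importing its module eagerly.
events = make_lazy_module(
    "events_demo", {"OrderedDict": ("collections", "OrderedDict")}
)
```

In a real package the same effect comes from defining `def __getattr__(name)` at the top level of the module file. Nothing is imported at load time, which is what allows two modules that reference each other to be loaded in either order.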
@masenf
Collaborator

masenf commented Mar 23, 2026

@greptile-apps please do a final re-review
