Add streaming workflow design doc, renumber design docs by karthikiyer56 · Pull Request #677 · stellar/stellar-rpc

karthikiyer56 · 2026-04-15T23:57:23Z

Add 02-streaming-workflow.md: streaming mode design covering startup validation, first-start .bin loading, per-ledger ingestion loop, three independent sub-flow transitions (LFS, events, txhash), crash recovery invariants, backfill-to-streaming migration, and error handling
Rename 03-backfill-workflow.md → 01-backfill-workflow.md
Update README with new numbering, reading order, and completeness status
Add _ref-old-* files to .gitignore (local reference only)


Copilot

Pull request overview

Adds and renumbers Full History “design-docs” documentation to cover both backfill and streaming ingestion workflows, and updates supporting references/ignore rules.

Changes:

Added a new streaming workflow design doc (startup validation, ingestion loop, transitions, recovery invariants).
Renumbered/added the backfill workflow doc to 01-* and updated the design-docs README with new reading order/status.
Updated .gitignore to ignore local _ref-old-* reference files.

Reviewed changes

Copilot reviewed 2 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `full-history/design-docs/README.md` | Reworked overview, doc list/status, reading order, and shared concepts. |
| `full-history/design-docs/02-streaming-workflow.md` | New streaming-mode design covering ingestion, transitions, and crash recovery. |
| `full-history/design-docs/01-backfill-workflow.md` | Backfill workflow design doc under new numbering. |
| `.gitignore` | Ignores local `_ref-old-*` files under `full-history/design-docs/`. |


Copilot · 2026-04-16T00:01:09Z

+    if stored_cpi is None:
+        # First ever run writes the value. Backfill writes this on first run;
+        # if streaming runs first (no prior backfill), streaming writes it.
+        meta_store.put("config:chunks_per_txhash_index", config.chunks_per_txhash_index)


Step 1 says streaming can be the first-ever run and will write config:chunks_per_txhash_index, but Step 2 immediately fatals if no backfill chunk data exists (“run backfill first”). Please reconcile this (either document that backfill is required before streaming, or describe the supported streaming-from-scratch bootstrap behavior).
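
One way to reconcile the two steps is the first option: treat backfill as a hard prerequisite and check for it before touching the meta store. The sketch below is illustrative only. `InMemoryMetaStore`, `FatalStartupError`, `validate_streaming_startup`, and the `has_backfill_chunks` flag are hypothetical names, and treating a stored/configured mismatch as fatal is an assumption that the excerpt above does not confirm.

```python
class FatalStartupError(RuntimeError):
    """Streaming mode cannot start safely."""


class InMemoryMetaStore:
    """Hypothetical stand-in for the doc's meta_store (get/put only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


CPI_KEY = "config:chunks_per_txhash_index"


def validate_streaming_startup(meta_store, configured_cpi, has_backfill_chunks):
    # Check the prerequisite first so a run that is about to fail writes nothing.
    if not has_backfill_chunks:
        raise FatalStartupError("no backfill chunk data found; run backfill first")

    stored_cpi = meta_store.get(CPI_KEY)
    if stored_cpi is None:
        # Assumption: tolerated only when chunk data exists but the key does not.
        meta_store.put(CPI_KEY, configured_cpi)
        return configured_cpi
    if stored_cpi != configured_cpi:
        # Assumption: a mismatch is fatal; the excerpt does not show this case.
        raise FatalStartupError(
            f"{CPI_KEY} is {stored_cpi}, but config says {configured_cpi}"
        )
    return stored_cpi
```

With this ordering, the "streaming writes it first" branch can no longer run on a store that the next step would reject.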

Copilot · 2026-04-16T00:01:09Z

+    # 4. If index boundary: trigger index-level transitions.
+    #    The index boundary ledger is always also a chunk boundary ledger.
+    current_index = current_chunk // chunks_per_txhash_index
+    if ledger_seq == index_last_ledger(current_index):


In process_ledger, chunks_per_txhash_index is referenced but not defined in the pseudocode’s scope. To keep the design unambiguous, pass it in (e.g., via config) or reference config.chunks_per_txhash_index as done elsewhere in the doc.
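
A minimal sketch of the suggested fix, with the value passed in explicitly through a config object. The chunk geometry here is an assumption made for illustration (ledgers start at `FIRST_LEDGER = 2` and every chunk holds `ledgers_per_chunk` ledgers); `StreamingConfig`, `boundaries_hit`, and both formulas are hypothetical and are not taken from the design doc.

```python
from dataclasses import dataclass

FIRST_LEDGER = 2  # assumption: sequence number of the first ledger


@dataclass(frozen=True)
class StreamingConfig:
    ledgers_per_chunk: int          # hypothetical parameter
    chunks_per_txhash_index: int    # named as in the doc's config


def chunk_last_ledger(chunk: int, cfg: StreamingConfig) -> int:
    # Last ledger of a chunk under the assumed fixed-size geometry.
    return FIRST_LEDGER + (chunk + 1) * cfg.ledgers_per_chunk - 1


def index_last_ledger(index: int, cfg: StreamingConfig) -> int:
    # An index ends where its last chunk ends.
    last_chunk = (index + 1) * cfg.chunks_per_txhash_index - 1
    return chunk_last_ledger(last_chunk, cfg)


def boundaries_hit(ledger_seq: int, cfg: StreamingConfig) -> list[str]:
    """Return which boundaries this ledger closes, using only values from cfg."""
    current_chunk = (ledger_seq - FIRST_LEDGER) // cfg.ledgers_per_chunk
    current_index = current_chunk // cfg.chunks_per_txhash_index
    hit = []
    if ledger_seq == chunk_last_ledger(current_chunk, cfg):
        hit.append("chunk")
    if ledger_seq == index_last_ledger(current_index, cfg):
        hit.append("index")
    return hit
```

Because `index_last_ledger` is defined through `chunk_last_ledger`, every index boundary in this sketch is also a chunk boundary, matching the comment in the pseudocode.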

- Replace DAG mermaid diagram with pseudocode dependency comments
- Convert remaining prose paragraphs to bullet lists
- Restructure crash recovery invariants as bullet sub-lists
- Fix section heading from "First-Boot TxHash Store" to "Load Backfill TxHash Data into RocksDB"
- Replace all "first boot" with "first start in streaming mode"
- Add inline comments explaining boundary math (subtract 2, chunk_last_ledger formula)
- Add concise main flow (run_streaming) near top of doc
- Add dynamic vs static DAG explanation
- Call out .bin loading as one-time cost

- .bin files only exist if backfill left a partial txhash index
- If backfill ended on an index boundary, step 3 is a no-op
- Remove language implying .bin loading always happens on first start
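
The no-op condition can be sketched as a remainder check. This assumes backfill covers a contiguous run of chunks starting at the beginning of an index; `leftover_chunks` and `backfilled_chunks` are hypothetical names used only for illustration.

```python
def leftover_chunks(backfilled_chunks: int, chunks_per_txhash_index: int) -> int:
    """Chunks belonging to a trailing, incomplete txhash index (0 means none)."""
    if chunks_per_txhash_index <= 0:
        raise ValueError("chunks_per_txhash_index must be positive")
    return backfilled_chunks % chunks_per_txhash_index


def needs_bin_loading(backfilled_chunks: int, chunks_per_txhash_index: int) -> bool:
    # Step 3 only has work to do when backfill stopped partway through an index.
    return leftover_chunks(backfilled_chunks, chunks_per_txhash_index) != 0
```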

- Events system persists (term_key, event_id) pairs per ledger in the embedded DB for crash recovery. On restart, deltas are replayed to rebuild in-memory bitmaps.
- WAL is an internal DB mechanism; nobody reads it directly.
- Replace all "WAL-backed deltas" / "Events WAL" with "persisted deltas" / "persisted index deltas" for events-specific references.
- RocksDB WAL references for ledger/txhash stores remain unchanged (correct usage).
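
The replay described in this commit can be sketched roughly as follows. Plain Python sets stand in for the in-memory bitmaps, and the shape of the persisted records (an iterable of `(ledger_seq, pairs)` in ledger order) is an assumption; `replay_deltas` is a hypothetical name.

```python
from collections import defaultdict


def replay_deltas(persisted):
    """Rebuild term_key -> set(event_id) from persisted per-ledger deltas.

    persisted: iterable of (ledger_seq, [(term_key, event_id), ...]).
    """
    bitmaps = defaultdict(set)
    for _ledger_seq, pairs in persisted:
        for term_key, event_id in pairs:
            # Adding to a set is idempotent, so replaying a ledger twice is harmless.
            bitmaps[term_key].add(event_id)
    return dict(bitmaps)
```

In this sketch, replaying the same deltas more than once yields the same result, so a restart does not need to track exactly where the previous replay stopped.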

karthikiyer56 requested a review from Copilot April 15, 2026 23:57

Copilot started reviewing on behalf of karthikiyer56 April 15, 2026 23:58

Copilot AI reviewed Apr 16, 2026


karthikiyer56 added 3 commits April 15, 2026 17:25

Fix .bin loading language: not always present on first start

3e4a2cf


karthikiyer56 wants to merge 4 commits into feature/full-history from karthik/streaming-design-doc
