Conversation
…ByMarkers, getVtxosFromCacheOrDB
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior is configurable.
Walkthrough

Adds marker-based DAG scaling and cursored pagination to VTXO chain traversal: VTXOs gain Depth and MarkerIDs fields.
Sequence Diagram(s)

sequenceDiagram
participant Client
participant Indexer as Indexer Service
participant Cache as Per-Page Cache
participant VtxoRepo as Vtxo Repo
participant MarkerRepo as Marker Repo
participant OffchainRepo as Offchain Tx Repo
Client->>Indexer: GetVtxoChain(outpoint, pageToken)
Indexer->>Indexer: decodeChainCursor(pageToken) / derive frontier
Indexer->>Cache: ensureVtxosCached(frontier)
Cache->>VtxoRepo: GetVtxos(missing outpoints)
VtxoRepo-->>Cache: return vtxos (with Depth, MarkerIDs)
Indexer->>MarkerRepo: GetMarkersByIds(markerIDs) -- preload
MarkerRepo-->>Indexer: return marker parents
Indexer->>VtxoRepo: GetVtxosByMarker(markerID) -- gap-fill
VtxoRepo-->>Indexer: return related vtxos
alt preconfirmed txs present
Indexer->>OffchainRepo: GetOffchainTxsByTxids(txids)
OffchainRepo-->>Indexer: return offchain txs
end
Indexer->>Indexer: build response, encodeChainCursor(remaining)
Indexer-->>Client: Chain[], NextPageToken
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs
@coderabbitai review

✅ Actions performed: Review triggered.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@internal/core/application/indexer.go`:
- Around line 408-413: The loop that walks parent markers (using
marker.ParentMarkerIDs and i.repoManager.Markers().GetMarker) can infinite-loop
on cyclic parent chains; add a visited set (map[string]struct{}) keyed by marker
ID and check it before appending IDs or calling GetMarker to break cycles, and
also deduplicate ParentMarkerIDs when appending to markerIDs so you don't re-add
the same ID; update the loop to mark the current marker ID as visited, skip any
parent IDs already visited, and stop traversal if the next parent is seen.
In `@internal/infrastructure/db/badger/marker_repo.go`:
- Around line 501-514: GetVtxoChainByMarkers currently does a full table scan
via r.vtxoStore.Find(&dtos, &badgerhold.Query{}) and filters in-memory; change
it to query by marker IDs to avoid loading all vtxos: iterate markerIDs (or
batch them) and call r.vtxoStore.Find with badgerhold.Where("MarkerID").Eq(id)
for each id (or badgerhold.Where("MarkerID").In(batch) if supported), collect
matched vtxoDTOs, convert dto.Vtxo into the vtxos slice and return; ensure you
still respect markerIDSet and handle errors per-query.
In
`@internal/infrastructure/db/postgres/migration/20260210100000_add_vtxo_depth.up.sql`:
- Around line 8-13: The Postgres view vtxo_vw currently returns NULL for the
commitments column when no rows exist because it uses
string_agg(vc.commitment_txid, ','); change the SELECT to wrap string_agg with
COALESCE (e.g., COALESCE(string_agg(...), '')) so commitments always yields an
empty string like the SQLite view; update the SELECT that references vtxo,
vtxo_commitment_txid and the commitments alias to use
COALESCE(string_agg(vc.commitment_txid, ','), '').
In `@internal/infrastructure/db/service.go`:
- Around line 752-821: sweepVtxosWithMarkers currently marks a marker swept
before guaranteeing the marker's VTXOs were successfully removed, risking
inconsistent state; change the ordering so you attempt to sweep the marker's
VTXOs first (use markerStore.SweepVtxosByMarker and fall back to
vtxoStore.SweepVtxos for markerVtxos[markerID]) and only if that sweep returns
success call markerStore.SweepMarker(markerID, sweptAt); on any sweep error keep
the marker unmarked, log the failure, and accumulate the fallback count as now
done — update the loop in sweepVtxosWithMarkers to perform
SweepVtxosByMarker/SweepVtxos before calling SweepMarker and adjust error
handling accordingly.
🧹 Nitpick comments (12)
internal/test/e2e/utils_test.go (1)
742-744: Acknowledge the TODO placeholder.

The TODO is clear about the dependency on the SDK proto package exposing Depth. Consider tracking this with a GitHub issue so it doesn't get lost. Would you like me to open a GitHub issue to track re-enabling setupRawIndexerClient and getVtxoDepthByOutpoint once the SDK proto exposes Depth?

internal/core/domain/marker_test.go (2)
9-35: Good coverage of boundary cases.

The table-driven test covers a solid range including edges (0, 99, 100, 101). Consider using t.Run with a subtest name for each case to get more granular test output on failure.

♻️ Optional: use subtests for better diagnostics

 for _, tt := range tests {
+	t.Run(fmt.Sprintf("depth_%d", tt.depth), func(t *testing.T) {
 		result := IsAtMarkerBoundary(tt.depth)
 		require.Equal(t, tt.expected, result, "IsAtMarkerBoundary(%d) should be %v", tt.depth, tt.expected)
+	})
 }

(You'd need to add "fmt" to imports.)
42-55: These tests only verify struct literal construction, not behavior.

TestMarkerStruct and TestSweptMarkerStruct test that Go struct fields hold the values you assign — they don't test any domain logic. They're fine as documentation of the data model but provide no regression protection. Consider adding tests for actual marker operations (creation, parent resolution, etc.) as the marker logic matures.

internal/infrastructure/db/postgres/migration/20260211020000_add_markers.up.sql (1)
2-6: Consider adding NOT NULL DEFAULT '[]'::jsonb to parent_markers.

The column currently allows NULL, which means application code must handle both NULL and empty array. Using a NOT NULL default simplifies queries and Go code that deserializes this field.
♻️ Suggested change
 CREATE TABLE IF NOT EXISTS marker (
     id TEXT PRIMARY KEY,
     depth INTEGER NOT NULL,
-    parent_markers JSONB -- JSON array of parent marker IDs
+    parent_markers JSONB NOT NULL DEFAULT '[]'::jsonb -- JSON array of parent marker IDs
 );

internal/core/domain/marker_repo.go (2)
5-44: Large interface — consider whether it could be split, and watch for unbounded queries.

The interface has 16 methods mixing marker lifecycle, sweep operations, and VTXO queries. This is functional but may violate the Interface Segregation Principle as it grows.

More concretely, methods like GetVtxosByMarker, GetVtxosByDepthRange, and GetVtxoChainByMarkers (lines 30, 37, 41) return unbounded []Vtxo slices. If marker/depth ranges can span many VTXOs, callers may hit memory pressure. Consider whether pagination or a limit parameter is warranted for these, especially GetVtxosByDepthRange, which could span a very wide range.
6-7: Clarify upsert semantics in the doc comment.

The comment says "creates or updates a marker," but the method signature uses error as the only signal. It may be useful to document whether an update replaces ParentMarkerIDs entirely or merges, and whether updating a marker that has already been swept is allowed.

internal/infrastructure/db/sqlite/migration/20260211000000_add_markers.up.sql (1)
2-6: Consider adding an index on parent_markers for BFS descendant lookups.

The marker table stores parent_markers as a JSON text column. The Badger implementation does BFS by querying markers whose ParentMarkerIDs contains a given ID. If a similar query pattern is used in SQLite (e.g., using json_each to find children), performance could degrade without an index strategy. This is fine for now if the query load is low, but worth keeping in mind.

internal/infrastructure/db/badger/marker_repo.go (2)
42-106: Constructor uses interface{} variadic config — consider a typed options struct.

The NewMarkerRepository(config ...interface{}) pattern with positional interface{} arguments is fragile and hard to use correctly. While this matches the existing codebase pattern (e.g., NewVtxoRepository), a typed config struct would be safer. This is fine for now if consistency with the existing pattern is preferred.
116-136: Retry loop doesn't respect context cancellation.

The retry loops (here and in similar patterns at lines 243, 390, 435) sleep unconditionally without checking ctx.Done(). If the context is cancelled, the function will still retry up to maxRetries times with 100ms sleeps. This is a minor concern given the small retry count.

internal/infrastructure/db/sqlite/marker_repo.go (1)
141-166: Make descendant sweeping atomic to avoid partial state.

If an insert fails mid-loop, some markers are swept and others aren't. Wrapping the inserts in a single transaction avoids partial sweeps and reduces round-trips.
♻️ Suggested transaction wrapper
 func (m *markerRepository) SweepMarkerWithDescendants(
 	ctx context.Context, markerID string, sweptAt int64,
 ) (int64, error) {
 	// Get all descendant marker IDs (including the root marker) that are not already swept
 	descendantIDs, err := m.querier.GetDescendantMarkerIds(ctx, markerID)
 	if err != nil {
 		return 0, fmt.Errorf("failed to get descendant markers: %w", err)
 	}

-	// Insert each descendant into swept_marker
-	var count int64
-	for _, id := range descendantIDs {
-		err := m.querier.InsertSweptMarker(ctx, queries.InsertSweptMarkerParams{
-			MarkerID: id,
-			SweptAt:  sweptAt,
-		})
-		if err != nil {
-			return count, fmt.Errorf("failed to sweep marker %s: %w", id, err)
-		}
-		count++
-	}
-
-	return count, nil
+	tx, err := m.db.BeginTx(ctx, nil)
+	if err != nil {
+		return 0, err
+	}
+	q := queries.New(tx)
+	defer tx.Rollback()
+
+	var count int64
+	for _, id := range descendantIDs {
+		if err := q.InsertSweptMarker(ctx, queries.InsertSweptMarkerParams{
+			MarkerID: id,
+			SweptAt:  sweptAt,
+		}); err != nil {
+			return count, fmt.Errorf("failed to sweep marker %s: %w", id, err)
+		}
+		count++
+	}
+	if err := tx.Commit(); err != nil {
+		return 0, err
+	}
+	return count, nil
 }

internal/infrastructure/db/postgres/marker_repo.go (1)
144-169: Make descendant sweeping atomic to avoid partial state.

Same concern as the sqlite implementation — if the loop fails mid-way, markers can be partially swept.
♻️ Suggested transaction wrapper
 func (m *markerRepository) SweepMarkerWithDescendants(
 	ctx context.Context, markerID string, sweptAt int64,
 ) (int64, error) {
 	// Get all descendant marker IDs (including the root marker) that are not already swept
 	descendantIDs, err := m.querier.GetDescendantMarkerIds(ctx, markerID)
 	if err != nil {
 		return 0, fmt.Errorf("failed to get descendant markers: %w", err)
 	}

-	// Insert each descendant into swept_marker
-	var count int64
-	for _, id := range descendantIDs {
-		err := m.querier.InsertSweptMarker(ctx, queries.InsertSweptMarkerParams{
-			MarkerID: id,
-			SweptAt:  sweptAt,
-		})
-		if err != nil {
-			return count, fmt.Errorf("failed to sweep marker %s: %w", id, err)
-		}
-		count++
-	}
-
-	return count, nil
+	tx, err := m.db.BeginTx(ctx, nil)
+	if err != nil {
+		return 0, err
+	}
+	q := queries.New(tx)
+	defer tx.Rollback()
+
+	var count int64
+	for _, id := range descendantIDs {
+		if err := q.InsertSweptMarker(ctx, queries.InsertSweptMarkerParams{
+			MarkerID: id,
+			SweptAt:  sweptAt,
+		}); err != nil {
+			return count, fmt.Errorf("failed to sweep marker %s: %w", id, err)
+		}
+		count++
+	}
+	if err := tx.Commit(); err != nil {
+		return 0, err
+	}
+	return count, nil
 }

internal/infrastructure/db/sqlite/sqlc/query.sql (1)
467-480: Replace LIKE-based JSON matching with json_each() for robustness.

The recursive CTE uses m.parent_markers LIKE '%"' || dm.id || '"%' to check if a marker ID exists in a JSON array. While this works in practice, it's fragile: marker IDs containing SQL LIKE wildcards (%, _) would cause incorrect matches since the code doesn't escape them. The Postgres version correctly uses @> jsonb_build_array(dm.id) (line 473); SQLite should use the equivalent json_each() for consistency and correctness.

Suggested replacement

-    SELECT m.id FROM marker m
-    INNER JOIN descendant_markers dm ON (
-        m.parent_markers LIKE '%"' || dm.id || '"%'
-    )
+    SELECT m.id FROM marker m
+    INNER JOIN descendant_markers dm ON EXISTS (
+        SELECT 1 FROM json_each(m.parent_markers) je WHERE je.value = dm.id
+    )
CREATE VIEW vtxo_vw AS
SELECT v.*, string_agg(vc.commitment_txid, ',') AS commitments
FROM vtxo v
LEFT JOIN vtxo_commitment_txid vc
ON v.txid = vc.vtxo_txid AND v.vout = vc.vtxo_vout
GROUP BY v.txid, v.vout;
Inconsistent NULL handling between Postgres and SQLite views.
The SQLite vtxo_vw uses COALESCE(group_concat(...), '') (returns empty string when no commitments), but this Postgres view uses bare string_agg(...) (returns NULL when no commitments). This can cause behavioral differences across backends if downstream code doesn't uniformly handle both NULL and empty string.
Consider wrapping with COALESCE for consistency:
Suggested fix
CREATE VIEW vtxo_vw AS
-SELECT v.*, string_agg(vc.commitment_txid, ',') AS commitments
+SELECT v.*, COALESCE(string_agg(vc.commitment_txid, ','), '') AS commitments
FROM vtxo v
LEFT JOIN vtxo_commitment_txid vc
ON v.txid = vc.vtxo_txid AND v.vout = vc.vtxo_vout
GROUP BY v.txid, v.vout;
… to use BulkSweepMarkers
@coderabbitai review

✅ Actions performed: Review triggered.
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
internal/infrastructure/db/badger/vtxo_repo.go (2)
101-119: ⚠️ Potential issue | 🟠 Major

GetVtxos returns nil, nil when any single outpoint is missing, discarding all previously collected vtxos.

getVtxo returns (nil, nil) for ErrNotFound (Line 476-477), so the strings.Contains(err.Error(), "not found") check on Line 108 is dead code for that path. Instead, execution falls through to Line 113 where vtxo == nil triggers return nil, nil, silently dropping all vtxos already appended. This should continue instead, matching the likely intent of skipping missing outpoints.

Proposed fix

 for _, outpoint := range outpoints {
 	vtxo, err := r.getVtxo(ctx, outpoint)
 	if err != nil {
-		if strings.Contains(err.Error(), "not found") {
-			continue
-		}
 		return nil, err
 	}
 	if vtxo == nil {
-		return nil, nil
+		continue
 	}
 	vtxos = append(vtxos, *vtxo)
 }
277-322: ⚠️ Potential issue | 🟡 Minor

Inconsistent filter: query uses Ge (>=) but post-filter uses > (strictly greater).

Lines 286-288 and 295-297 fetch vtxos with Amount >= amountFilter, but Lines 306 and 311 then exclude vtxos where Amount == amountFilter by checking vtxo.Amount > amountFilter. This means vtxos with amount exactly equal to the filter are fetched from the DB but silently dropped. Either the query should use Gt or the post-filter should use >=.
🤖 Fix all issues with AI agents
In `@internal/core/application/service_test.go`:
- Around line 149-154: The test case "no spent vtxos" expects depth 1 but the
service's logic leaves newDepth at 0 when spentOutpoints is empty; update the
test in service_test.go for the case named "no spent vtxos (empty)" to set
expectedDepth to 0 so it matches the actual behavior of the service (referencing
newDepth, maxDepth and spentOutpoints in the service implementation).
In `@internal/infrastructure/db/badger/marker_repo.go`:
- Around line 258-276: SweepMarker currently does a full table scan by calling
r.vtxoStore.Find(&allDtos, &badgerhold.Query{}) for every marker (and
BulkSweepMarkers calls SweepMarker in a loop), causing N full scans; change
SweepMarker to query only VTXOs that contain the marker by using
r.vtxoStore.Find(&filteredDtos,
badgerhold.Where("MarkerIDs").Contains(markerID)) (same pattern as
getDescendantMarkerIds), iterate filteredDtos (type vtxoDTO) and call
r.vtxoStore.Update(outpoint.String(), dto) to set Swept=true and UpdatedAt; this
ensures each marker triggers a targeted query instead of scanning all VTXOs and
avoids the N×full-scan behavior in BulkSweepMarkers.
- Around line 438-462: GetVtxosByMarker currently loads all VTXOs then filters
in memory; change the find to use an indexed query so Badger filters by
MarkerIDs: replace the badgerhold.Query{} call in
markerRepository.GetVtxosByMarker with
badgerhold.Where("MarkerIDs").Contains(markerID) (keeping the same
r.vtxoStore.Find(&dtos, query) pattern), then retain the existing loop to
compute vtxo.Swept via r.isAnyMarkerSwept(dto.MarkerIDs) and append matching
DTOs to the result slice.
In `@internal/infrastructure/db/postgres/marker_repo.go`:
- Around line 159-184: SweepMarkerWithDescendants does inserts in a loop without
a transaction, causing partial commits on failure; wrap the entire operation in
a DB transaction so either all descendant InsertSweptMarker calls succeed or
none do. Start a transaction (e.g., via m.db.BeginTx or your repo's transaction
helper), run GetDescendantMarkerIds and then perform each
queries.InsertSweptMarker using the transactional querier/context (or passing tx
into the querier methods), rollback on any error and commit at the end, and
return the count only after a successful commit; reference functions:
SweepMarkerWithDescendants, GetDescendantMarkerIds, and InsertSweptMarker.
In
`@internal/infrastructure/db/postgres/migration/20260210100000_add_depth_and_markers.down.sql`:
- Around line 1-14: Move the view drops to before any column/table drops: drop
views intent_with_inputs_vw and vtxo_vw first, then drop index idx_vtxo_markers,
drop columns markers and depth from table vtxo, and finally drop marker and
swept_marker tables; update the script so vtxo_vw and intent_with_inputs_vw are
removed prior to altering vtxo to avoid PostgreSQL dependency errors.
In
`@internal/infrastructure/db/postgres/migration/20260210100000_add_depth_and_markers.up.sql`:
- Around line 1-5: Change the new column definition for markers on table vtxo to
be non-nullable with a default empty JSON array by altering the ADD COLUMN
statement for markers to "ADD COLUMN IF NOT EXISTS markers JSONB NOT NULL
DEFAULT '[]'::jsonb" (keep the existing GIN index creation), and ensure any
separate backfill step that populates markers for existing rows is consistent
with this default (remove/adjust redundant backfill or ensure it uses
'[]'::jsonb for rows without markers).
In `@internal/infrastructure/db/postgres/sqlc/queries/query.sql.go`:
- Around line 1770-1776: The SQL in the constant selectVtxosByArkTxid used by
the method SelectVtxosByArkTxid filters on the wrong column (txid); update the
query string to use WHERE ark_txid = $1 (or WHERE ark_txid = `@ark_txid` in the
.sql source) so the function returns VTXOs created by the given ark transaction;
update the selectVtxosByArkTxid SQL in both the Postgres and SQLite query.sql
sources so the generated query and the Queries.SelectVtxosByArkTxid
implementation both filter on ark_txid instead of txid.
- Around line 118-131: The recursive CTE used by GetDescendantMarkerIds scans
marker.parent_markers with the jsonb containment operator (@>), causing repeated
sequential scans; add a migration that creates a GIN index on the parent_markers
column (marker.parent_markers) so the recurrence m.parent_markers @>
jsonb_build_array(dm.id) can use the index; implement the migration file that
executes CREATE INDEX IF NOT EXISTS idx_marker_parent_markers ON marker USING
GIN (parent_markers) and ensure it is applied in your migrations pipeline.
In `@internal/infrastructure/db/service.go`:
- Around line 210-219: The code appends badgerVtxoRepo.GetStore() onto
config.DataStoreConfig which can mutate the original slice's backing array;
instead create a new slice copy of config.DataStoreConfig before appending to
avoid side effects. Locate the block that builds markerConfig (using
config.DataStoreConfig, badgerVtxoRepo.GetStore() and markerStoreFactory) and
replace the direct append with creating a new slice sized to hold the elements,
copying config.DataStoreConfig into it, then append badgerVtxoRepo.GetStore() to
that new slice and pass the new slice to markerStoreFactory.
- Around line 492-496: CreateRootMarkersForVtxos failures are currently only
warned and can leave persisted VTXOs referencing missing markers; update the
block where s.markerStore.CreateRootMarkersForVtxos(ctx, newVtxos) is called to
either (a) retry the CreateRootMarkersForVtxos call with the same retry/backoff
strategy used for persisting VTXOs (mirror the loop around the VTXO
persistence), or (b) if retrying fails, return the error to fail-fast so the
caller can roll back or handle incomplete state; locate the call to
s.markerStore.CreateRootMarkersForVtxos and implement a retry loop (or propagate
the error) and ensure logs include context about the affected VTXO set when
giving up.
- Around line 538-577: The GetVtxos DB failure path leaves newDepth at 0 and
parentMarkerIDs nil which makes IsAtMarkerBoundary(0) treat chained VTXOs as
root markers; update the error path in the block that calls s.vtxoStore.GetVtxos
(inside the loop over offchainTx.CheckpointTxs and subsequent processing) to
avoid creating misleading root markers by either returning the error upward
(propagate the GetVtxos error) or setting newDepth to a sentinel (e.g., a
special unknown value) and ensuring downstream logic
(IsAtMarkerBoundary/newDepth handling) treats that sentinel as “unknown” (no
root marker creation) instead of depth 0, and document the chosen approach in
the same function where newDepth and parentMarkerIDs are computed.
In
`@internal/infrastructure/db/sqlite/migration/20260210000000_add_depth_and_markers.down.sql`:
- Around line 28-33: The down migration tries to copy the removed swept column
from vtxo into vtxo_temp causing "no such column: swept"; fix by reconstructing
swept or restoring the column before the INSERT: either add a temporary swept
column to vtxo (or vtxo_temp) prior to the INSERT (so INSERT INTO vtxo_temp
SELECT ... swept ... FROM vtxo succeeds), or change the INSERT SELECT to compute
swept from swept_marker (join vtxo with swept_marker and derive swept) so the
SELECT no longer references the missing swept column; look for symbols
vtxo_temp, vtxo, swept, swept_marker and vtxo_new when applying the change.
In `@internal/infrastructure/db/sqlite/sqlc/query.sql`:
- Around line 471-484: GetDescendantMarkerIds currently matches parent_markers
via m.parent_markers LIKE '%"' || dm.id || '"%' which is brittle (false
positives for '%'/'_' and overlapping prefixes) and forces full scans; replace
the LIKE with a JSON-aware check using SQLite's json_each (e.g., JOIN/EXISTS
over json_each(m.parent_markers) j WHERE j.value = dm.id) or, better, migrate
parent_markers to a normalized join table (parent_marker mapping) and update
descendant_markers to join that table; also add an integration test for
GetDescendantMarkerIds using marker IDs containing characters like '%'/'_' and
overlapping prefixes to ensure correctness, and document the current limitation
of the LIKE approach in the schema/query comments.
🧹 Nitpick comments (20)
internal/core/application/sweeper_test.go (2)
734-735: Non-obvious Txid values for i >= 26.

string(rune('A'+i)) for i in 0..49 produces ASCII letters A–Z for i < 26, but non-letter characters ([, \, ], …) for i >= 26. This doesn't break the test (uniqueness is preserved), but fmt.Sprintf("child-%d", i) would be clearer and consistent with TestCreateCheckpointSweepTask_LargeMarkerSet (line 1189).

Suggested fix

-	childOutpoints[i] = domain.Outpoint{Txid: "child" + string(rune('A'+i)), VOut: 0}
+	childOutpoints[i] = domain.Outpoint{Txid: fmt.Sprintf("child-%d", i), VOut: 0}
22-158: Consider generating mocks to reduce boilerplate.

~400 lines of hand-rolled mocks for WalletService, VtxoRepository, MarkerRepository, and TxBuilder. Most methods are stubs returning zero values. Using a tool like mockery or counterfeiter would auto-generate these, reduce maintenance burden as interfaces evolve, and keep the test file focused on test logic.

internal/core/domain/marker_repo.go (1)
41-47: VTXO retrieval methods on MarkerRepository blur the boundary with VtxoRepository.

GetVtxosByDepthRange, GetVtxosByArkTxid, and GetVtxoChainByMarkers return []Vtxo and are essentially VTXO queries. Placing them here is understandable since they're marker/depth-optimized, but it means callers now need to know which repository to ask for VTXOs depending on the query pattern. If the interface continues to grow, consider whether a dedicated chain-traversal service or moving these to VtxoRepository with marker-aware implementations would keep the boundaries cleaner.

internal/infrastructure/db/sqlite/sqlc/query.sql (1)
261-270: Liquidity queries now scan every vtxo row with a correlated LIKE subquery.

SelectExpiringLiquidityAmount and SelectRecoverableLiquidityAmount both use EXISTS (SELECT 1 FROM swept_marker sm WHERE v.markers LIKE '%"' || sm.marker_id || '"%'). This is essentially a cross join between vtxo and swept_marker with a LIKE predicate on every pair — O(vtxos × swept_markers) string scans per query. As the number of swept markers grows, these queries will degrade.

Consider caching swept status on the vtxo row itself (a denormalized swept flag updated during BulkSweepMarkers), or evaluating sweep status in the application layer where the marker set is already available.

Also applies to: 273-279
internal/infrastructure/db/postgres/sqlc/query.sql (1)
500-514: Inconsistent projection: SELECT * vs sqlc.embed(vtxo_vw) across vtxo queries.

SelectVtxosByDepthRange, SelectVtxosByArkTxid, and SelectVtxoChainByMarker use SELECT * FROM vtxo_vw, while all other vtxo queries (e.g., SelectAllVtxos, SelectVtxo, SelectSweepableUnrolledVtxos) use SELECT sqlc.embed(vtxo_vw) FROM vtxo_vw. This generates different Go return types — flat structs vs. nested struct { VtxoVw VtxoVw } — requiring different mapping code in the repository layer.

Consider using sqlc.embed(vtxo_vw) consistently so the generated Go types are uniform.

Suggested fix
 -- name: SelectVtxosByDepthRange :many
 -- Get all VTXOs within a depth range, useful for filling gaps between markers
-SELECT * FROM vtxo_vw
+SELECT sqlc.embed(vtxo_vw) FROM vtxo_vw
 WHERE depth >= @min_depth AND depth <= @max_depth
 ORDER BY depth DESC;

 -- name: SelectVtxosByArkTxid :many
 -- Get all VTXOs created by a specific ark tx (offchain tx)
-SELECT * FROM vtxo_vw WHERE txid = @ark_txid;
+SELECT sqlc.embed(vtxo_vw) FROM vtxo_vw WHERE txid = @ark_txid;

 -- name: SelectVtxoChainByMarker :many
 -- Get VTXOs whose markers JSONB array contains any of the given marker IDs
-SELECT * FROM vtxo_vw
+SELECT sqlc.embed(vtxo_vw) FROM vtxo_vw
 WHERE markers ?| @marker_ids::TEXT[]
 ORDER BY depth DESC;

internal/infrastructure/db/sqlite/vtxo_repo.go (1)
538-548: Silent error swallowing in parseMarkersJSONFromVtxo could mask data corruption.

If the JSON in the markers column is malformed, this function silently returns nil without any logging. While defensive, this could make it hard to diagnose data integrity issues.

Consider adding a log warning on parse failure, consistent with how other parse errors are handled elsewhere in the codebase.
Optional: add warning log on parse failure
 func parseMarkersJSONFromVtxo(markersJSON string) []string {
 	if markersJSON == "" {
 		return nil
 	}
 	var markerIDs []string
 	if err := json.Unmarshal([]byte(markersJSON), &markerIDs); err != nil {
+		// Log warning to help diagnose data corruption
+		log.WithError(err).Warn("failed to parse markers JSON from vtxo")
 		return nil
 	}
 	return markerIDs
 }

internal/core/application/indexer_test.go (1)
15-299: Consider using a mock generation tool to reduce boilerplate.

The manual mock implementations (~280 lines of stubs) are correct but add significant maintenance burden. Tools like mockery or moq could auto-generate these from the interfaces and keep them in sync as the repository interfaces evolve.

That said, the explicit nil-interface handling in Markers() (lines 288–294) is a valuable pattern worth keeping regardless.

internal/infrastructure/db/badger/marker_repo.go (1)
524-528: Dead error handling — assign-then-discard pattern is misleading.

Line 525 calls r.SweepMarker(...) and assigns to err, then line 528 discards it with _ = err. This is confusing — use _ = directly on the call.

Simplify the error discard
-	if err := r.SweepMarker(ctx, markerID, time.Now().Unix()); err != nil {
-		// Non-fatal - the vtxos are already marked as swept
-		_ = err
-	}
+	// Non-fatal - the vtxos are already marked as swept
+	_ = r.SweepMarker(ctx, markerID, time.Now().Unix())

internal/infrastructure/db/sqlite/marker_repo.go (2)
398-508: Four nearly identical rowToVtxoFrom* functions — consider a shared mapper.

rowToVtxoFromMarkerQuery, rowToVtxoFromDepthRangeQuery, rowToVtxoFromArkTxidQuery, and rowToVtxoFromChainQuery all perform the same mapping from VtxoVw embedded in different sqlc row types. Since the inner row.VtxoVw is the same type (queries.VtxoVw), you could extract a shared vtxoVwToDomain(vw queries.VtxoVw) domain.Vtxo and call it from each wrapper, reducing ~100 lines of duplication.

Note that vtxo_repo.go already has rowToVtxo(row queries.VtxoVw) which does essentially the same mapping — you could reuse that directly.

Consolidate using the existing rowToVtxo from vtxo_repo.go
 func rowToVtxoFromMarkerQuery(row queries.SelectVtxosByMarkerIdRow) domain.Vtxo {
-	var commitmentTxids []string
-	if commitments, ok := row.VtxoVw.Commitments.(string); ok && commitments != "" {
-		commitmentTxids = strings.Split(commitments, ",")
-	}
-	return domain.Vtxo{
-		Outpoint: domain.Outpoint{
-			Txid: row.VtxoVw.Txid,
-			VOut: uint32(row.VtxoVw.Vout),
-		},
-		// ... all fields ...
-	}
+	return rowToVtxo(row.VtxoVw)
 }

Apply the same pattern to all four functions.
510-519: Duplicate parseMarkersJSON — already exists as parseMarkersJSONFromVtxo in vtxo_repo.go.

Both functions in this package have identical logic. Consolidate into a single shared function.
internal/core/application/indexer.go (2)
416-431: Consider batching GetMarker calls to reduce DB round-trips during BFS.

Each iteration of the BFS loop issues an individual GetMarker DB call (line 420). For deep marker chains (e.g., depth 20000 with markers every 100 levels = ~200 markers), this results in ~200 sequential queries. A batch approach using GetMarkersByIds on the current queue batch would be significantly faster.

♻️ Sketch of batched BFS
 for len(queue) > 0 {
-	currentID := queue[0]
-	queue = queue[1:]
-
-	marker, err := i.repoManager.Markers().GetMarker(ctx, currentID)
-	if err != nil || marker == nil {
+	// Fetch all markers in current queue batch at once
+	batch := queue
+	queue = nil
+	markers, err := i.repoManager.Markers().GetMarkersByIds(ctx, batch)
+	if err != nil {
 		continue
 	}
-
-	for _, parentID := range marker.ParentMarkerIDs {
-		if !visited[parentID] {
-			visited[parentID] = true
-			markerIDs = append(markerIDs, parentID)
-			queue = append(queue, parentID)
+	for _, marker := range markers {
+		for _, parentID := range marker.ParentMarkerIDs {
+			if !visited[parentID] {
+				visited[parentID] = true
+				markerIDs = append(markerIDs, parentID)
+				queue = append(queue, parentID)
+			}
 		}
 	}
 }
**465-474: Cache is mutated via the `cache` map parameter — document this side effect.**

`getVtxosFromCacheOrDB` updates the caller's map in-place (line 473). This is correct for the current usage pattern, but the mutation is non-obvious. A brief doc note on the side effect would improve maintainability.

internal/infrastructure/db/postgres/migration/20260210100000_add_depth_and_markers.up.sql (2)
**73-83: Correlated EXISTS subquery in the view may degrade as `swept_marker` grows.**

The `vtxo_vw` view computes `swept` via `EXISTS (SELECT 1 FROM swept_marker sm WHERE v.markers @> jsonb_build_array(sm.marker_id))`. This scans `swept_marker` for each VTXO row. While the GIN index on `markers` helps the containment check, this is effectively a semi-join where the outer side (`swept_marker`) is iterated per-vtxo. As the number of swept markers grows, this scan may become expensive for queries that touch many VTXOs.

Consider whether a reverse lookup (joining vtxo markers against the `swept_marker` PK) or a materialized approach would scale better for your expected data volumes.
**25-30: Remove intermediate view creation — they are dropped and recreated without ever being used.**

The views created at lines 25-30 and 32-40 are dropped at lines 65-66 before being recreated at lines 73-93. The backfill queries (lines 44-62) query `vtxo` directly, so these intermediate views are never referenced and can be removed to simplify the migration.

♻️ Simplified migration flow

```diff
 -- Drop views before dropping the swept column (views depend on it via v.*)
-DROP VIEW IF EXISTS intent_with_inputs_vw;
-DROP VIEW IF EXISTS vtxo_vw;
-
-CREATE VIEW vtxo_vw AS
-SELECT v.*, string_agg(vc.commitment_txid, ',') AS commitments
-FROM vtxo v
-LEFT JOIN vtxo_commitment_txid vc
-ON v.txid = vc.vtxo_txid AND v.vout = vc.vtxo_vout
-GROUP BY v.txid, v.vout;
-
-CREATE VIEW intent_with_inputs_vw AS
-SELECT vtxo_vw.*,
-       intent.id,
-       intent.round_id,
-       intent.proof,
-       intent.message
-FROM intent
-LEFT OUTER JOIN vtxo_vw
-ON intent.id = vtxo_vw.intent_id;
-
 -- Backfill: Create a marker for every existing VTXO using its outpoint as marker ID
```

internal/infrastructure/db/sqlite/migration/20260210000000_add_depth_and_markers.up.sql (1)
**22-41: Intermediate view recreation appears unused — backfill queries reference `vtxo` directly.**

Same as the Postgres migration: the views created at lines 26-41 are dropped again at lines 96-97 without being referenced by the backfill statements (lines 45-62). They add migration complexity without benefit.
internal/infrastructure/db/postgres/marker_repo.go (2)
**249-274: TOCTOU between count query and sweep insert in `SweepVtxosByMarker`.**

`CountUnsweptVtxosByMarkerId` (line 260) and `InsertSweptMarker` (line 266) are not atomic. The returned count may not reflect the actual number of VTXOs affected by the sweep. Since the count is only used for logging/metrics, this isn't a correctness issue, but worth noting.

**426-436: Silent error swallowing in `parseMarkersJSONB` — consider logging.**

Unmarshal errors at line 432 are silently swallowed. If corrupted marker JSON ends up in the database, this would silently produce nil marker IDs, making affected VTXOs invisible to marker-based queries. A debug-level log would aid troubleshooting without adding noise.
internal/core/application/service_test.go (1)
**562-567: `outputs[0].MarkerIDs` is re-sorted on every loop iteration.**

The sort at line 563 mutates `outputs[0].MarkerIDs` in-place on each iteration. Move it before the loop.

♻️ Minor optimization

```diff
 // All outputs must have the same marker IDs
+sort.Strings(outputs[0].MarkerIDs)
 for i := 1; i < len(outputs); i++ {
-	sort.Strings(outputs[0].MarkerIDs)
 	sort.Strings(outputs[i].MarkerIDs)
 	require.Equal(t, outputs[0].MarkerIDs, outputs[i].MarkerIDs,
 		"output %d has different markers than output 0", i)
 }
```

internal/infrastructure/db/badger/vtxo_repo.go (2)
**23-26: Duplicate accessors: `GetStore()` and `Store()` return the same value.**

Both methods on Lines 23-26 and Lines 421-424 return `r.store` with identical signatures. Pick one and remove the other to avoid confusion about which to call.

Also applies to: 421-424

**625-637: Redundant visited check in `GetSweepableVtxosByCommitmentTxid`.**

Line 627 checks `!visited[outpointKey]`, and Line 628 checks `!seen` on the same key. Since `visited` maps to `bool`, `!visited[key]` is true iff the key is absent (zero-value `false`), making the inner check always true when reached. This also means Lines 633-635 (enqueue `ArkTxid`) are unreachable for already-visited outpoints — which is correct — but the double-check is confusing. Compare with the cleaner pattern in `GetAllChildrenVtxos` (Lines 668-676).

Simplify to match GetAllChildrenVtxos pattern

```diff
 for _, vtxo := range vtxos {
 	outpointKey := vtxo.Outpoint.String()
-	if !visited[outpointKey] {
-		if _, seen := visited[outpointKey]; !seen {
-			visited[outpointKey] = true
-			outpoints = append(outpoints, vtxo.Outpoint)
-		}
-
-		if vtxo.ArkTxid != "" {
-			queue = append(queue, vtxo.ArkTxid)
-		}
+	if !visited[outpointKey] {
+		visited[outpointKey] = true
+		outpoints = append(outpoints, vtxo.Outpoint)
+		if vtxo.ArkTxid != "" {
+			queue = append(queue, vtxo.ArkTxid)
+		}
 	}
 }
```
# Conflicts: # internal/interface/grpc/handlers/parser_test.go
|
👋 @bitcoin-coder-bob — this PR has been open for 39 days with outstanding change requests from @louisinger. Last push was over 30 days ago. Is this still being worked on, or is it blocked on something? Let us know if we can help unblock. |
|
🔍 Arkana PR Review (incremental — new commits since last review)

Changes since last review: Added comprehensive benchmark suite (

Assessment: ✅ Excellent test additions

Benchmarks (
New unit tests (
No security concerns in this increment — all changes are test/bench code. The core DAG traversal and marker logic were reviewed in the previous pass. |
|
Hey @bitcoin-coder-bob — @louisinger requested changes on Mar 16. Any update on this? Has the feedback been addressed in recent commits? |
|
Iterative review — new commit

This commit addresses a real race condition: the DB projection updates asynchronously, so

The fix:
Review notes:
Good fix for a subtle but important race. 👍 |
|
Changes were requested 6+ days ago. @bitcoin-coder-bob need any help addressing the feedback? |
)

* bulk-fetch offchain txs in walkVtxoChain to reduce DB round-trips
* Add benchmark and test for bulk offchain tx fetch improvement
* index idx_checkpoint_tx_offchain_txid
* testing our performance times for GetOffchainTxsByTxids
* chunk SQLite bulk offchain tx fetch + multi-txid test
* lint
* preload offchain txs via marker DAG in walkVtxoChain, nbxplorer version bump
* offchainTxRepo check, may be nil in test helpers
* move walkVtxoChain timing out of prod code into in-process test
* clarify simulated latency in timing breakdown and panic on unwired repo accessors
There was a problem hiding this comment.
🔍 Arkana PR Review — arkd#908 (incremental)
New commit since last review: 69879d04 — bulk-fetch offchain txs in walkVtxoChain (#1005). Plus 40ff9f82 (merge master, no logic).
This commit eliminates the remaining O(N) bottleneck I flagged in previous reviews: per-hop GetOffchainTx calls during chain traversal. Now offchain txs are bulk-loaded alongside VTXOs during the marker preload phase, and lazily bulk-fetched per-iteration for cache misses. Good.
Findings
1. 🟡 Badger GetOffchainTxsByTxids — fragile error matching
`internal/infrastructure/db/badger/ark_repo.go` — the new method uses `strings.Contains(err.Error(), "not found")` to skip missing txids. This is brittle: if the error message in `getOffchainTx` ever changes (or a wrapped error incidentally contains "not found"), behavior silently changes. Use `errors.Is` with a sentinel error, or at minimum `errors.As` with a typed error. The underlying `getOffchainTx` already checks `badgerhold.ErrNotFound` — expose that check properly rather than string-matching on the formatted message.
2. ✅ Postgres/SQLite bulk queries — safe
- Postgres: `txid = ANY($1::varchar[])` with `pq.Array(txids)` — parameterized, no injection risk.
- SQLite: `sqliteMaxBulkTxids = 500` correctly stays under `SQLITE_MAX_VARIABLE_NUMBER`. Batching loop is clean.
- Both filter `COALESCE(fail_reason, '') = ''` consistently with the single-fetch query. Good.
3. ✅ New index migration
20260409140000_checkpoint_tx_offchain_txid_index — adds idx_checkpoint_tx_offchain_txid on checkpoint_tx(offchain_txid). This supports the JOIN inside offchain_tx_vw when bulk-fetching. Correct optimization — the view was already doing this join, just without the index to back the multi-row case.
4. ✅ Preload error semantics
preloadByMarkers hard-fails if GetOffchainTxsByTxids errors, while marker window errors log-and-continue. This asymmetry is actually correct: marker cache misses degrade to per-VTXO lookups (functional), but offchain tx failures would produce incomplete chain data (incorrect). Right call.
5. ✅ Cache key consistency
Offchain tx cache uses tx.ArkTxid as key, lookups use vtxo.Txid. These are the same field semantically (the offchain tx's ark txid is the VTXO's parent txid). Verified in domain model.
6. ✅ Fallback path
When bulk fetch returns empty or misses a txid, the walk loop falls back to individual GetOffchainTx and populates the cache (offchainTxCache[vtxo.Txid] = offchainTx). Correct — handles both pre-optimization repos and race conditions where a tx is inserted after the preload.
7. ✅ Test coverage
- `TestBulkOffchainTxReducesDBCalls` — fanout tree, proves bulk calls = O(depth) vs individual calls = O(N). Correct assertion.
- `TestVtxoChainTimingBreakdown` — 10k-chain wall-clock breakdown with simulated latency. Useful diagnostic.
- `BenchmarkOffchainTxBulkVsSingle` — direct comparison with simulated 50µs latency. Good.
- `preconfirmed chain bulk-loads offchain txs` + `falls back to single fetch on cache miss` — unit tests for both the happy and fallback paths.
- DB integration test in `service_test.go` — multi-txid fetch with cross-contamination guard. Thorough.
8. Nit: docker-compose.regtest.yml bumps nbxplorer 2.5.30 → 2.5.30-1. Unrelated — should probably be a separate commit.
9. Security: No impact. This is purely a read-path optimization. No changes to exit paths, forfeit validation, transaction signing, or sweep correctness.
Summary
Only finding #1 (badger string-matching) is worth addressing — it's a latent bug vector. Everything else is clean. The optimization is well-structured with proper fallback paths and strong test coverage.
|
@arkanaai @coderabbitai review |
|
✅ Actions performedReview triggered.
|
Arkana Review — Scale the DAG (#908)

Impressive engineering here. The marker DAG concept is sound and the benchmarks are compelling (20,000 → ~200 DB round-trips for a 10k chain). I went through the implementation carefully. A few issues need attention before merge, ranging from protocol-critical to minor.

🔴 Critical: Sweep over-reach (protocol-critical)

File:

The sweep path changed from targeting specific VTXOs by outpoint to sweeping entire marker windows:

```go
// OLD — precise: sweeps exactly childrenVtxos
_, err = s.repoManager.Vtxos().SweepVtxos(ctx, childrenVtxos)

// NEW — sweeps ALL VTXOs tagged with each marker, regardless of whether they're in childrenVtxos
markerStore.BulkSweepMarkers(ctx, markerIDs, sweptAt)
```

A marker covers a 100-depth window across the entire VTXO DAG — not just the subtree being swept. If VTXO-A and VTXO-B share a marker window but only VTXO-A's descendants are being swept in this checkpoint, sweeping the marker will incorrectly mark VTXO-B's descendants as swept too. This could cause real fund loss: VTXOs incorrectly marked swept won't be claimable by users.

Fix needed: Either ensure markers are always subtree-scoped (never shared across independent VTXO subtrees), or track sweep at finer granularity. The migration backfill uses one marker per VTXO for legacy data — does the forward-creation path guarantee that a marker window is always contained within a single redemption subtree?

🔴 Critical: Unsigned pagination cursor enables auth bypass

File:

The cursor is plain base64-encoded JSON with no HMAC or signature:

```go
func encodeChainCursor(frontier []domain.Outpoint) string {
	data, _ := json.Marshal(cur)
	return base64.RawURLEncoding.EncodeToString(data)
}
```

Auth validation in `GetVtxoChain` only verifies the entry outpoint, so a client can supply a forged cursor whose frontier contains outpoints it was never authorized to traverse.

Fix: HMAC the cursor with a server-side secret, or validate that each outpoint in a decoded cursor is reachable from the original

🟡 Medium: Silent sweep failure when no markers present

File:

```go
if len(uniqueMarkers) == 0 {
	return nil // silent no-op
}
```

If a VTXO has no markers (possible for legacy VTXOs despite the migration backfill, or if marker assignment failed), the sweep silently does nothing. The old code would have swept those VTXOs regardless. A

🟡 Medium:
|
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
internal/core/application/service.go (1)
**364-381:** ⚠️ Potential issue | 🟠 Major

**Reject partial parent loads before deriving child metadata.**

This block assumes `GetVtxos` returned every parent. If the repository only returns a subset, the emitted `newVtxos` get a shallower `Depth` and may miss inherited `Swept` state. Mirror the completeness check already used in `SubmitOffchainTx` and bail out on partial reads.

🛠️ Suggested guard

```diff
 	spentVtxos, err := s.repoManager.Vtxos().GetVtxos(
 		context.Background(), spentVtxoKeys,
 	)
 	if err != nil {
 		log.WithError(err).Warn("failed to get spent vtxos")
 		return
 	}
+	if len(spentVtxos) != len(spentVtxoKeys) {
+		log.WithFields(log.Fields{
+			"expected": len(spentVtxoKeys),
+			"got":      len(spentVtxos),
+			"txid":     txid,
+		}).Warn("failed to load all spent vtxos")
+		return
+	}
 	// Calculate depth for new vtxos: max(parent depths) + 1
 	var maxDepth uint32
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/application/service.go` around lines 364 - 381, The code assumes repoManager.Vtxos().GetVtxos returned all parents but doesn't verify completeness: before computing maxDepth and setting newVtxos[].Depth, check that the returned spentVtxos length matches the requested spentVtxoKeys and bail out (return early with a warning/error) on partial reads, similar to the completeness guard used in SubmitOffchainTx; use the same pattern around GetVtxos / spentVtxoKeys / spentVtxos to detect missing parents and avoid deriving incorrect Depth or Swept state for newVtxos.internal/core/application/indexer.go (1)
**708-716:** ⚠️ Potential issue | 🟠 Major

**Whitelist checkpoint txids when building chain auth tokens.**

`createAuthToken(allOutpoints)` is what later authorizes `GetVirtualTxs`, but this branch never adds checkpoint txids to `allOutpoints`. That means clients can receive a checkpoint txid from `GetVtxoChain` and then get rejected when they try to fetch the corresponding virtual tx.

Suggested fix

```diff
 		txid := ptx.UnsignedTx.TxID()
 		checkpointTxs = append(checkpointTxs, ChainTx{
 			Txid:      txid,
 			ExpiresAt: vtxo.ExpiresAt,
 			Type:      IndexerChainedTxTypeCheckpoint,
 			Spends:    []string{ptx.UnsignedTx.TxIn[0].PreviousOutPoint.String()},
 		})
+		allOutpoints = append(allOutpoints, Outpoint{
+			Txid: txid,
+			VOut: 0,
+		})
 		chainTx.Spends = append(chainTx.Spends, txid)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/application/indexer.go` around lines 708 - 716, This branch builds checkpoint ChainTxs (see checkpointTxs append and chainTx.Spends update) but never includes checkpoint txids in the allOutpoints set used by createAuthToken(allOutpoints), so clients can be authorized to see chain entries but then rejected when calling GetVirtualTxs; update the same branch that constructs checkpointTxs (where txid := ptx.UnsignedTx.TxID()) to also add that txid to the allOutpoints collection (or a dedicated whitelist passed into createAuthToken) so createAuthToken(allOutpoints) includes checkpoint txids and authorizes GetVirtualTxs for those checkpoint entries.internal/infrastructure/db/service.go (2)
**507-548:** ⚠️ Potential issue | 🟠 Major

**Stop after retry exhaustion instead of continuing with partial projection state.**

If any of these loops exhausts all 5 attempts, the handler falls through as if the projection update succeeded. That can leave the round partially projected, and `CreateRootMarkersForVtxos` can still run after `AddVtxos` never persisted anything.

🛠️ Make retry success explicit

```diff
-	for attempt := range maxProjectionRetry {
+	success := false
+	for attempt := 0; attempt < maxProjectionRetry; attempt++ {
 		if err := repo.SettleVtxos(ctx, spentVtxos, round.CommitmentTxid); err != nil {
 			log.WithError(err).Warnf(
 				"failed to spend vtxos (attempt %d/%d)", attempt+1, maxProjectionRetry,
 			)
 			time.Sleep(100 * time.Millisecond)
 			continue
 		}
 		log.Debugf("spent %d vtxos", len(spentVtxos))
+		success = true
 		break
 	}
+	if !success {
+		log.WithField("round_id", round.Id).Error("failed to settle vtxos after retries")
+		return
+	}
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infrastructure/db/service.go` around lines 507 - 548, The retry loops for repo.SettleVtxos, repo.AddVtxos, and s.markerStore.CreateRootMarkersForVtxos currently swallow failures after maxProjectionRetry and allow subsequent steps to run, leading to partial projection state; change each loop (the one iterating over maxProjectionRetry for spentVtxos and the two loops for newVtxos) to explicitly detect exhaustion and return or propagate an error (using ctx/round.CommitmentTxid for context) instead of breaking silently so upstream handler stops processing when SettleVtxos or AddVtxos or CreateRootMarkersForVtxos ultimately fail; reference the functions repo.SettleVtxos, repo.AddVtxos and s.markerStore.CreateRootMarkersForVtxos and the maxProjectionRetry/sleep logic to implement an explicit failure path after retries.
**590-729:** ⚠️ Potential issue | 🔴 Critical

**Already-swept finalized outputs lose their sweep state in the marker model.**

`txSwept`/`outputSwept` only sets `domain.Vtxo.Swept`, but sweep status is now derived from marker membership. On non-boundary depths `domain.NewMarker(...)` can return `nil`, so these VTXOs just inherit `parentMarkerIDs`; nothing here sweeps those markers unless the output is dust. That means expired/already-swept finalized outputs are persisted as active again.

The dust path has the same problem on failures: it still appends `dustMarkerID` after `AddMarker` fails, and `BulkSweepMarkers` only warns on error. In both cases you can end up with VTXOs that should be swept but are not representable as swept.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infrastructure/db/service.go` around lines 590 - 729, The code fails to persist sweep state into the marker model: when NewMarker returns nil (inherited parentMarkerIDs) or when outputs are expired/swept (txSwept/outputSwept), you must explicitly mark the involved markers as swept instead of only setting domain.Vtxo.Swept; call s.markerStore.BulkSweepMarkers(ctx, markerIDs, sweptAt) (or for per-vtxo dust use vtxoMarkerIDs) before persisting vtxos whenever txSwept or outputSwept is true; also only append dustMarkerID to createdDustMarkerIDs and to vtxoMarkerIDs after s.markerStore.AddMarker successfully returns (do not append on AddMarker error), and treat BulkSweepMarkers failures as a hard error (return) or retry so you don’t persist vtxos whose swept state isn’t recorded (adjust control flow around s.vtxoStore.AddVtxos and the existing BulkSweepMarkers call accordingly).
🧹 Nitpick comments (4)
internal/infrastructure/db/sqlite/sqlc/query.sql (2)
**495-504: LIKE-based marker matching is safe but not indexable.**

The comment correctly notes that LIKE is used because sqlc cannot parse `json_each` with view columns, and that txid:vout format IDs are safe. However, queries using `LIKE '%"' || @marker_id || '"%'` cannot leverage indexes on the `markers` column. For very large VTXO tables, consider documenting this limitation in the query comments or adding a future optimization note.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infrastructure/db/sqlite/sqlc/query.sql` around lines 495 - 504, The LIKE-based marker matching in the SelectVtxosByMarkerId and CountUnsweptVtxosByMarkerId queries on view vtxo_vw (using markers LIKE '%"' || `@marker_id` || '"%') is not indexable and may perform poorly on large VTXO tables; update the SQL comment for both query definitions to explicitly document this limitation and add a short "Future optimization" note suggesting migrating to a JSON-aware index/search (or rewriting to use json_each on the underlying table) when sqlc/view support permits so future maintainers know why LIKE is used and how to improve it.
**262-281: Performance concern with `json_each` in liquidity queries.**

Both `SelectExpiringLiquidityAmount` and `SelectRecoverableLiquidityAmount` use `json_each(v.markers)` in correlated subqueries. For large VTXO tables, this will parse JSON for every row. Consider adding a comment documenting expected performance characteristics, or evaluate an indexing strategy for the markers column (SQLite has no GIN indexes, so this would mean an expression index or a normalized marker table) if query times become problematic at scale.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infrastructure/db/sqlite/sqlc/query.sql` around lines 262 - 281, Both SelectExpiringLiquidityAmount and SelectRecoverableLiquidityAmount repeatedly call json_each(v.markers) which forces JSON parsing per vtxo row and will degrade at scale; add a short comment above these queries documenting the performance risk and expected data sizes, and either (a) denormalize markers into a dedicated vtxo_marker table and join on sm.marker_id with an index for fast lookup, or (b) if your SQLite build supports a JSON-indexing extension, add an appropriate index on the markers column; reference v.markers, swept_marker.sm.marker_id, and the two query names when applying the change so reviewers can find and validate the optimization.internal/infrastructure/db/sqlite/vtxo_repo.go (1)
**573-583: Silent error suppression in `parseMarkersJSONFromVtxo` may hide data corruption.**

The function returns `nil` on unmarshal errors without logging or surfacing the issue. If the markers JSON becomes malformed (e.g., due to a bug or data corruption), this would silently degrade functionality without any indication.

Consider logging a warning when unmarshal fails:
♻️ Proposed fix to add error logging
```diff
 func parseMarkersJSONFromVtxo(markersJSON string) []string {
 	if markersJSON == "" {
 		return nil
 	}
 	var markerIDs []string
 	if err := json.Unmarshal([]byte(markersJSON), &markerIDs); err != nil {
+		// Log but don't fail - treat malformed markers as empty
+		// This shouldn't happen in normal operation
+		log.Warnf("failed to parse markers JSON: %v", err)
 		return nil
 	}
 	return markerIDs
 }
```

internal/core/application/indexer_test.go (1)
Verify each finding against the current code and only fix it if needed. In `@internal/infrastructure/db/sqlite/vtxo_repo.go` around lines 573 - 583, The parseMarkersJSONFromVtxo function currently swallows json.Unmarshal errors; change it to log a warning when unmarshal fails so data corruption is visible: in parseMarkersJSONFromVtxo, when json.Unmarshal returns an error, call the project's logging facility (e.g., log.Printf or the repository logger) to emit a warning that includes the error and a truncated/sanitized markersJSON value (to aid debugging without leaking sensitive data), then continue to return nil (or an empty slice if preferred by callers). Ensure the log message references parseMarkersJSONFromVtxo and includes the error and the markersJSON snippet.internal/core/application/indexer_test.go (1)
**374-377: The catch-all bulk-fetch mock masks the bulk preload optimization.**

The `On("GetOffchainTxsByTxids", mock.Anything, mock.Anything).Return([]*domain.OffchainTx{}, nil).Maybe()` expectation is registered first with unlimited repeatability (from `.Maybe()`) and matches any arguments via `mock.Anything`. In testify/mock, expectations are evaluated in declaration order and the first match wins. Since this catch-all can match indefinitely, all subsequent `GetOffchainTxsByTxids` calls — regardless of arguments — will match this expectation first, and the specific per-test expectations are never reached.

This means tests set up specific bulk-fetch behaviors but never actually exercise them. The code still works because empty results trigger fallback to `GetOffchainTx`, but the optimization path is never verified. A regression in the bulk-fetch logic would go undetected.

Consider registering the catch-all with limited repeatability or moving it after specific test expectations so tests can verify the bulk path is actually used.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/core/application/indexer_test.go` around lines 374 - 377, The catch-all mock for GetOffchainTxsByTxids uses mock.Anything plus .Maybe(), which matches first and prevents per-test expectations from being exercised; change the registration so it doesn't override specific expectations by either removing .Maybe() and setting a limited repeat count (e.g., Once/Times) or moving this generic On("GetOffchainTxsByTxids", mock.Anything, mock.Anything) call to after tests register their specific expectations, ensuring the per-test mocks for GetOffchainTxsByTxids are matched and the bulk preload optimization is actually exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/core/application/indexer_bench_test.go`:
- Around line 621-630: The benchmark zeroes the repo counters each iteration
because repo.reset() is called inside the for loop, so b.ReportMetric reads only
the final-iteration counts; move repo.reset() to just before the for i := 0; i <
b.N; i++ loop (so repo.bulkCalls and repo.singleCalls accumulate across all
iterations) and then keep the existing
b.ReportMetric(float64(repo.bulkCalls.Load())/float64(b.N), ...) and singleCalls
reporting after b.StopTimer(); apply the same change for the other similar
benchmark block (the one around lines 640-649) to ensure metrics represent
calls/op across the whole run rather than the last iteration.
In `@internal/core/application/indexer.go`:
- Around line 356-357: The auth check currently only verifies vtxoKey via
validateChainAuth(authToken, vtxoKey, ...), allowing a client to supply a forged
decoded pageToken whose frontier contains other outpoints; update the code in
walkVtxoChain and the pageToken handling paths (where pageToken is decoded and
replaces vtxoKey) to validate every outpoint in the decoded frontier against the
token whitelist by calling validateChainAuth(authToken, frontierOutpoint, true)
(or a new helper that accepts an outpoint) for each frontier entry before using
it, or alternatively compute and verify an HMAC/signature over the serialized
cursor when generating and reading pageToken so cursors cannot be tampered with;
ensure the same change is applied to the other affected call sites referenced
(the continuation branch and the other pageToken usage blocks) so decoded
frontiers are never trusted without per-outpoint validation or cryptographic
binding to the auth token.
In `@internal/infrastructure/db/badger/marker_repo.go`:
- Around line 258-266: The current code does a best-effort update of VTXO DTOs
(using vtxoStore.Find -> vtxoStore.Update on filteredDtos / vtxoDTO) after
inserting swept_marker, but readers still return the stale dto.Swept value if
those updates fail; fix by either making the VTXO updates part of the write
contract (perform the vtxoStore.Update calls inside the same transactional
operation and propagate/return errors from the marker insert/update so failures
abort) or stop storing Swept in the DTO and compute it dynamically on read
(modify the Badger getters that read dto.Vtxo/dto.Swept to consult
sweptMarkerStore for marker existence and set Swept accordingly), and apply the
same change for the other similar block at 269-284.
In `@internal/infrastructure/db/postgres/marker_repo.go`:
- Around line 442-451: parseMarkersJSONB currently returns nil on JSON unmarshal
errors which silently treats corrupted marker payloads as "no markers"; change
parseMarkersJSONB to return ([]string, error), return an explicit error when
json.Unmarshal fails (and return an empty slice for empty input if desired), and
update all call sites that call parseMarkersJSONB to handle the error (propagate
or fail fast) instead of assuming a nil/empty slice; search for
parseMarkersJSONB references to update their signatures and error handling
accordingly.
- Around line 233-246: The UpdateVtxoMarkers function can marshal a nil
markerIDs slice to JSON null which breaks JSON-array membership checks; modify
markerRepository.UpdateVtxoMarkers so that before marshalling you normalize nil
marker slices to an empty slice (e.g. if markerIDs == nil set markerIDs =
[]string{} or marshal an explicit empty slice) and then marshal to JSON and pass
that byte array into queries.UpdateVtxoMarkersParams.Markers so the DB always
stores "[]" for no markers instead of "null".
In
`@internal/infrastructure/db/postgres/migration/20260219082956_fix_vtxo_vw_perf.up.sql`:
- Around line 23-26: The current EXISTS subquery scans swept_marker for each v
row; replace it by unnesting v.markers (e.g. using
jsonb_array_elements_text(v.markers) or a LATERAL join) and directly joining the
resulting marker values to swept_marker.marker_id so the planner can use the
swept_marker index; update the view to compute swept as a boolean from that join
(e.g. existence via JOIN + DISTINCT or boolean aggregate like bool_or)
referencing v.markers, swept_marker.marker_id and the swept_marker table instead
of the current EXISTS(...) pattern.
In `@internal/infrastructure/db/postgres/sqlc/queries/query.sql.go`:
- Around line 118-131: The recursive CTE getDescendantMarkerIds can emit
duplicates because it uses UNION ALL; change the CTE to use set semantics
(replace UNION ALL with UNION) or deduplicate at the end (use SELECT DISTINCT on
descendant_markers.id) so each marker id appears only once; update the CTE named
descendant_markers / query named GetDescendantMarkerIds accordingly to prevent
duplicate marker rows from converging DAG paths.
- Around line 106-115: The query in the constant countUnsweptVtxosByMarkerId
(used by Queries.CountUnsweptVtxosByMarkerId) uses COUNT(*) which double-counts
VTXOs when vtxo_vw expands assets; change the SQL to count distinct outpoints
instead (e.g. SELECT COUNT(DISTINCT outpoint) FROM vtxo_vw ...), or if the view
does not expose a single outpoint column, use COUNT(DISTINCT (txid, vout)) or
equivalent to deduplicate by transaction id + output index, then keep the rest
of the function (QueryRowContext + Scan into int64) unchanged.
In `@internal/infrastructure/db/postgres/sqlc/query.sql`:
- Around line 475-488: GetDescendantMarkerIds' recursive CTE can cycle; modify
the CTE (descendant_markers) to carry a visited set/path (e.g., an array or
jsonb of marker ids) and only recurse into rows whose id is not already in that
visited set, appending the new id to visited on each recursion; keep the same
final SELECT filtering out ids in swept_marker and preserve the base case
(SELECT marker.id) but initialize visited with that id so cycles like A->B->A
are prevented.
---
Outside diff comments:
In `@internal/core/application/indexer.go`:
- Around line 708-716: This branch builds checkpoint ChainTxs (see checkpointTxs
append and chainTx.Spends update) but never includes checkpoint txids in the
allOutpoints set used by createAuthToken(allOutpoints), so clients can be
authorized to see chain entries but then rejected when calling GetVirtualTxs;
update the same branch that constructs checkpointTxs (where txid :=
ptx.UnsignedTx.TxID()) to also add that txid to the allOutpoints collection (or
a dedicated whitelist passed into createAuthToken) so
createAuthToken(allOutpoints) includes checkpoint txids and authorizes
GetVirtualTxs for those checkpoint entries.
In `@internal/core/application/service.go`:
- Around line 364-381: The code assumes repoManager.Vtxos().GetVtxos returned
all parents but doesn't verify completeness: before computing maxDepth and
setting newVtxos[].Depth, check that the returned spentVtxos length matches the
requested spentVtxoKeys and bail out (return early with a warning/error) on
partial reads, similar to the completeness guard used in SubmitOffchainTx; use
the same pattern around GetVtxos / spentVtxoKeys / spentVtxos to detect missing
parents and avoid deriving incorrect Depth or Swept state for newVtxos.
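A minimal Go sketch of such a completeness guard, assuming a hypothetical `Outpoint` key type rather than the repo's actual domain types:

```go
package main

import "fmt"

// Outpoint is a stand-in for the repo's txid:vout key type.
type Outpoint struct {
	Txid string
	Vout uint32
}

// missingOutpoints reports which requested keys a batch read did not return,
// so the caller can bail out on a partial read instead of deriving Depth or
// Swept state from an incomplete parent set.
func missingOutpoints(requested, returned []Outpoint) []Outpoint {
	got := make(map[Outpoint]struct{}, len(returned))
	for _, op := range returned {
		got[op] = struct{}{}
	}
	var missing []Outpoint
	for _, op := range requested {
		if _, ok := got[op]; !ok {
			missing = append(missing, op)
		}
	}
	return missing
}

func main() {
	requested := []Outpoint{{"aa", 0}, {"bb", 1}}
	returned := []Outpoint{{"aa", 0}}
	if m := missingOutpoints(requested, returned); len(m) > 0 {
		fmt.Printf("partial read: %d parent(s) missing: %v\n", len(m), m)
	}
}
```

Comparing only lengths is enough to detect the problem; returning the actual missing keys additionally gives the warning log something actionable.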
In `@internal/infrastructure/db/service.go`:
- Around line 507-548: The retry loops for repo.SettleVtxos, repo.AddVtxos, and
s.markerStore.CreateRootMarkersForVtxos currently swallow failures after
maxProjectionRetry and allow subsequent steps to run, leading to partial
projection state; change each loop (the one iterating over maxProjectionRetry
for spentVtxos and the two loops for newVtxos) to explicitly detect exhaustion
and return or propagate an error (using ctx/round.CommitmentTxid for context)
instead of breaking silently so upstream handler stops processing when
SettleVtxos or AddVtxos or CreateRootMarkersForVtxos ultimately fail; reference
the functions repo.SettleVtxos, repo.AddVtxos and
s.markerStore.CreateRootMarkersForVtxos and the maxProjectionRetry/sleep logic
to implement an explicit failure path after retries.
- Around line 590-729: The code fails to persist sweep state into the marker
model: when NewMarker returns nil (inherited parentMarkerIDs) or when outputs
are expired/swept (txSwept/outputSwept), you must explicitly mark the involved
markers as swept instead of only setting domain.Vtxo.Swept; call
s.markerStore.BulkSweepMarkers(ctx, markerIDs, sweptAt) (or for per-vtxo dust
use vtxoMarkerIDs) before persisting vtxos whenever txSwept or outputSwept is
true; also only append dustMarkerID to createdDustMarkerIDs and to vtxoMarkerIDs
after s.markerStore.AddMarker successfully returns (do not append on AddMarker
error), and treat BulkSweepMarkers failures as a hard error (return) or retry so
you don’t persist vtxos whose swept state isn’t recorded (adjust control flow
around s.vtxoStore.AddVtxos and the existing BulkSweepMarkers call accordingly).
---
Nitpick comments:
In `@internal/core/application/indexer_test.go`:
- Around line 374-377: The catch-all mock for GetOffchainTxsByTxids uses
mock.Anything plus .Maybe(), which matches first and prevents per-test
expectations from being exercised; change the registration so it doesn't
override specific expectations by either removing .Maybe() and setting a limited
repeat count (e.g., Once/Times) or moving this generic
On("GetOffchainTxsByTxids", mock.Anything, mock.Anything) call to after tests
register their specific expectations, ensuring the per-test mocks for
GetOffchainTxsByTxids are matched and the bulk preload optimization is actually
exercised.
In `@internal/infrastructure/db/sqlite/sqlc/query.sql`:
- Around line 495-504: The LIKE-based marker matching in the
SelectVtxosByMarkerId and CountUnsweptVtxosByMarkerId queries on view vtxo_vw
(using markers LIKE '%"' || `@marker_id` || '"%') is not indexable and may perform
poorly on large VTXO tables; update the SQL comment for both query definitions
to explicitly document this limitation and add a short "Future optimization"
note suggesting migrating to a JSON-aware index/search (or rewriting to use
json_each on the underlying table) when sqlc/view support permits so future
maintainers know why LIKE is used and how to improve it.
- Around line 262-281: Both SelectExpiringLiquidityAmount and
SelectRecoverableLiquidityAmount repeatedly call json_each(v.markers) which
forces JSON parsing per vtxo row and will degrade at scale; add a short comment
above these queries documenting the performance risk and expected data sizes,
and either (a) denormalize markers into a dedicated vtxo_marker table and join
on sm.marker_id with an index for fast lookup, or (b) if your SQLite build
supports a JSON-indexing extension, add an appropriate index on the markers
column; reference v.markers, swept_marker.sm.marker_id, and the two query names
when applying the change so reviewers can find and validate the optimization.
In `@internal/infrastructure/db/sqlite/vtxo_repo.go`:
- Around line 573-583: The parseMarkersJSONFromVtxo function currently swallows
json.Unmarshal errors; change it to log a warning when unmarshal fails so data
corruption is visible: in parseMarkersJSONFromVtxo, when json.Unmarshal returns
an error, call the project's logging facility (e.g., log.Printf or the
repository logger) to emit a warning that includes the error and a
truncated/sanitized markersJSON value (to aid debugging without leaking
sensitive data), then continue to return nil (or an empty slice if preferred by
callers). Ensure the log message references parseMarkersJSONFromVtxo and
includes the error and the markersJSON snippet.
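A hedged sketch of what the logging version could look like; `parseMarkers` and the 64-byte snippet cap are illustrative choices, and the real repo would use its own logging facility rather than the stdlib log package:

```go
package main

import (
	"encoding/json"
	"log"
)

// parseMarkers decodes a JSON array of marker ids. On corrupt input it logs
// a warning with a truncated snippet so data corruption becomes visible,
// then returns nil to preserve the current lenient behavior.
func parseMarkers(markersJSON string) []string {
	if markersJSON == "" {
		return nil
	}
	var ids []string
	if err := json.Unmarshal([]byte(markersJSON), &ids); err != nil {
		snippet := markersJSON
		if len(snippet) > 64 {
			snippet = snippet[:64] + "..."
		}
		log.Printf("warn: parseMarkers: bad markers JSON %q: %v", snippet, err)
		return nil
	}
	return ids
}

func main() {
	_ = parseMarkers(`{"not":"an array"`) // logs a warning, returns nil
	log.Println(parseMarkers(`["m1","m2"]`))
}
```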
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 27894188-09fc-45f9-8457-7b9e080f0e53
⛔ Files ignored due to path filters (2)
- api-spec/protobuf/gen/ark/v1/indexer.pb.go is excluded by !**/*.pb.go, !**/gen/**
- api-spec/protobuf/gen/ark/v1/types.pb.go is excluded by !**/*.pb.go, !**/gen/**
📒 Files selected for processing (55)
- api-spec/openapi/swagger/ark/v1/indexer.openapi.json
- api-spec/openapi/swagger/ark/v1/service.openapi.json
- api-spec/openapi/swagger/ark/v1/types.openapi.json
- api-spec/protobuf/ark/v1/indexer.proto
- api-spec/protobuf/ark/v1/types.proto
- docker-compose.regtest.yml
- internal/config/config.go
- internal/core/application/indexer.go
- internal/core/application/indexer_bench_test.go
- internal/core/application/indexer_exposure_test.go
- internal/core/application/indexer_test.go
- internal/core/application/service.go
- internal/core/application/sweeper.go
- internal/core/application/sweeper_test.go
- internal/core/application/token_cache.go
- internal/core/application/types.go
- internal/core/application/utils.go
- internal/core/application/utils_test.go
- internal/core/domain/marker.go
- internal/core/domain/marker_test.go
- internal/core/domain/offchain_tx.go
- internal/core/domain/offchain_tx_event.go
- internal/core/domain/offchain_tx_repo.go
- internal/core/domain/offchain_tx_test.go
- internal/core/domain/vtxo.go
- internal/infrastructure/db/badger/ark_repo.go
- internal/infrastructure/db/badger/marker_repo.go
- internal/infrastructure/db/badger/vtxo_repo.go
- internal/infrastructure/db/postgres/marker_repo.go
- internal/infrastructure/db/postgres/migration/20260210100000_add_depth_and_markers.up.sql
- internal/infrastructure/db/postgres/migration/20260219082956_fix_vtxo_vw_perf.up.sql
- internal/infrastructure/db/postgres/migration/20260409140000_checkpoint_tx_offchain_txid_index.down.sql
- internal/infrastructure/db/postgres/migration/20260409140000_checkpoint_tx_offchain_txid_index.up.sql
- internal/infrastructure/db/postgres/offchain_tx_repo.go
- internal/infrastructure/db/postgres/sqlc/queries/models.go
- internal/infrastructure/db/postgres/sqlc/queries/query.sql.go
- internal/infrastructure/db/postgres/sqlc/query.sql
- internal/infrastructure/db/postgres/vtxo_repo.go
- internal/infrastructure/db/service.go
- internal/infrastructure/db/service_test.go
- internal/infrastructure/db/sqlite/marker_repo.go
- internal/infrastructure/db/sqlite/migration/20260210000000_add_depth_and_markers.up.sql
- internal/infrastructure/db/sqlite/migration/20260219082956_fix_vtxo_vw_perf.up.sql
- internal/infrastructure/db/sqlite/migration/20260409140000_checkpoint_tx_offchain_txid_index.down.sql
- internal/infrastructure/db/sqlite/migration/20260409140000_checkpoint_tx_offchain_txid_index.up.sql
- internal/infrastructure/db/sqlite/offchain_tx_repo.go
- internal/infrastructure/db/sqlite/round_repo.go
- internal/infrastructure/db/sqlite/sqlc/queries/models.go
- internal/infrastructure/db/sqlite/sqlc/queries/query.sql.go
- internal/infrastructure/db/sqlite/sqlc/query.sql
- internal/infrastructure/db/sqlite/vtxo_repo.go
- internal/interface/grpc/handlers/indexer.go
- internal/interface/grpc/handlers/parser.go
- internal/interface/grpc/handlers/parser_test.go
- internal/test/e2e/vtxo_chain_test.go
✅ Files skipped from review due to trivial changes (10)
- internal/infrastructure/db/sqlite/migration/20260409140000_checkpoint_tx_offchain_txid_index.up.sql
- internal/infrastructure/db/postgres/migration/20260409140000_checkpoint_tx_offchain_txid_index.down.sql
- internal/infrastructure/db/sqlite/migration/20260409140000_checkpoint_tx_offchain_txid_index.down.sql
- internal/infrastructure/db/postgres/migration/20260409140000_checkpoint_tx_offchain_txid_index.up.sql
- internal/core/domain/offchain_tx_event.go
- docker-compose.regtest.yml
- internal/infrastructure/db/sqlite/round_repo.go
- api-spec/openapi/swagger/ark/v1/service.openapi.json
- api-spec/protobuf/ark/v1/types.proto
- internal/infrastructure/db/sqlite/offchain_tx_repo.go
🚧 Files skipped from review as they are similar to previous changes (10)
- internal/interface/grpc/handlers/parser.go
- api-spec/openapi/swagger/ark/v1/types.openapi.json
- internal/core/application/utils.go
- api-spec/protobuf/ark/v1/indexer.proto
- internal/core/domain/marker.go
- internal/infrastructure/db/postgres/sqlc/queries/models.go
- internal/infrastructure/db/postgres/migration/20260210100000_add_depth_and_markers.up.sql
- internal/core/application/sweeper_test.go
- internal/infrastructure/db/postgres/vtxo_repo.go
- internal/infrastructure/db/sqlite/marker_repo.go
// Update Swept field on VTXOs that contain this marker.
// This keeps the stored Swept field in sync for query compatibility.
// Errors here are non-fatal since swept_marker is already recorded.
var filteredDtos []vtxoDTO
if err := r.vtxoStore.Find(
    &filteredDtos,
    badgerhold.Where("MarkerIDs").Contains(markerID),
); err != nil {
    return nil
This update cannot be best-effort while reads still trust dto.Swept.
Once swept_marker is inserted, the source of truth says the VTXO is swept. But if any of these follow-up vtxoStore.Update(...) calls fail, the Badger getters in this file still return the stale stored Swept=false value because they read dto.Vtxo directly. That makes Badger diverge from the Postgres/SQLite behavior on transient write conflicts.
Either make the VTXO update part of the write contract here, or compute Swept dynamically from sweptMarkerStore on read.
Also applies to: 269-284
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/infrastructure/db/badger/marker_repo.go` around lines 258 - 266, The
current code does a best-effort update of VTXO DTOs (using vtxoStore.Find ->
vtxoStore.Update on filteredDtos / vtxoDTO) after inserting swept_marker, but
readers still return the stale dto.Swept value if those updates fail; fix by
either making the VTXO updates part of the write contract (perform the
vtxoStore.Update calls inside the same transactional operation and
propagate/return errors from the marker insert/update so failures abort) or stop
storing Swept in the DTO and compute it dynamically on read (modify the Badger
getters that read dto.Vtxo/dto.Swept to consult sweptMarkerStore for marker
existence and set Swept accordingly), and apply the same change for the other
similar block at 269-284.
// parseMarkersJSONB parses a JSONB array into a slice of strings
func parseMarkersJSONB(markers json.RawMessage) []string {
    if len(markers) == 0 {
        return nil
    }
    var markerIDs []string
    if err := json.Unmarshal(markers, &markerIDs); err != nil {
        return nil
    }
    return markerIDs
Don't silently treat corrupt marker JSON as “no markers”.
Returning nil on unmarshal failure makes a bad markers payload behave like an empty marker set, which can break chain preloading and swept resolution without surfacing any error. This helper should return an error so callers can fail fast instead of silently changing semantics.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/infrastructure/db/postgres/marker_repo.go` around lines 442 - 451,
parseMarkersJSONB currently returns nil on JSON unmarshal errors which silently
treats corrupted marker payloads as "no markers"; change parseMarkersJSONB to
return ([]string, error), return an explicit error when json.Unmarshal fails
(and return an empty slice for empty input if desired), and update all call
sites that call parseMarkersJSONB to handle the error (propagate or fail fast)
instead of assuming a nil/empty slice; search for parseMarkersJSONB references
to update their signatures and error handling accordingly.
EXISTS (
    SELECT 1 FROM swept_marker sm
    WHERE v.markers @> jsonb_build_array(sm.marker_id)
) AS swept,
Avoid scanning swept_marker for every VTXO row.
This EXISTS is written from the swept_marker side, so PostgreSQL has to test each swept marker against the outer row’s JSONB array. On a large swept_marker table that becomes an O(rows × swept_markers) check inside the view. Expanding v.markers and joining on sm.marker_id keeps the lookup on the indexed key instead.
Suggested rewrite
- EXISTS (
- SELECT 1 FROM swept_marker sm
- WHERE v.markers @> jsonb_build_array(sm.marker_id)
- ) AS swept,
+ EXISTS (
+ SELECT 1
+ FROM jsonb_array_elements_text(v.markers) AS vm(marker_id)
+ JOIN swept_marker sm ON sm.marker_id = vm.marker_id
+ ) AS swept,

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@internal/infrastructure/db/postgres/migration/20260219082956_fix_vtxo_vw_perf.up.sql`
around lines 23 - 26, The current EXISTS subquery scans swept_marker for each v
row; replace it by unnesting v.markers (e.g. using
jsonb_array_elements_text(v.markers) or a LATERAL join) and directly joining the
resulting marker values to swept_marker.marker_id so the planner can use the
swept_marker index; update the view to compute swept as a boolean from that join
(e.g. existence via JOIN + DISTINCT or boolean aggregate like bool_or)
referencing v.markers, swept_marker.marker_id and the swept_marker table instead
of the current EXISTS(...) pattern.
-- name: GetDescendantMarkerIds :many
-- Recursively get a marker and all its descendants (markers whose parent_markers contain it)
WITH RECURSIVE descendant_markers(id) AS (
    -- Base case: the marker being swept
    SELECT marker.id FROM marker WHERE marker.id = @root_marker_id
    UNION ALL
    -- Recursive case: find markers whose parent_markers jsonb array contains any descendant
    SELECT m.id FROM marker m
    INNER JOIN descendant_markers dm ON (
        m.parent_markers @> jsonb_build_array(dm.id)
    )
)
SELECT descendant_markers.id AS marker_id FROM descendant_markers
WHERE descendant_markers.id NOT IN (SELECT sm.marker_id FROM swept_marker sm);
Guard recursive descendant traversal against marker cycles.
This CTE has no visited set. If bad data ever creates A -> B -> A, GetDescendantMarkerIds can recurse forever and wedge SweepMarkerWithDescendants.
🛡️ Add cycle protection to the recursive CTE
-WITH RECURSIVE descendant_markers(id) AS (
+WITH RECURSIVE descendant_markers(id, path) AS (
-- Base case: the marker being swept
- SELECT marker.id FROM marker WHERE marker.id = `@root_marker_id`
+ SELECT marker.id, ARRAY[marker.id]::text[]
+ FROM marker
+ WHERE marker.id = `@root_marker_id`
UNION ALL
-- Recursive case: find markers whose parent_markers jsonb array contains any descendant
- SELECT m.id FROM marker m
+ SELECT m.id, dm.path || m.id
+ FROM marker m
INNER JOIN descendant_markers dm ON (
m.parent_markers @> jsonb_build_array(dm.id)
)
+ WHERE NOT m.id = ANY(dm.path)
)
-SELECT descendant_markers.id AS marker_id FROM descendant_markers
+SELECT DISTINCT descendant_markers.id AS marker_id
+FROM descendant_markers
WHERE descendant_markers.id NOT IN (SELECT sm.marker_id FROM swept_marker sm);

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-- name: GetDescendantMarkerIds :many
-- Recursively get a marker and all its descendants (markers whose parent_markers contain it)
WITH RECURSIVE descendant_markers(id, path) AS (
    -- Base case: the marker being swept
    SELECT marker.id, ARRAY[marker.id]::text[]
    FROM marker
    WHERE marker.id = @root_marker_id
    UNION ALL
    -- Recursive case: find markers whose parent_markers jsonb array contains any descendant
    SELECT m.id, dm.path || m.id
    FROM marker m
    INNER JOIN descendant_markers dm ON (
        m.parent_markers @> jsonb_build_array(dm.id)
    )
    WHERE NOT m.id = ANY(dm.path)
)
SELECT DISTINCT descendant_markers.id AS marker_id
FROM descendant_markers
WHERE descendant_markers.id NOT IN (SELECT sm.marker_id FROM swept_marker sm);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/infrastructure/db/postgres/sqlc/query.sql` around lines 475 - 488,
GetDescendantMarkerIds' recursive CTE can cycle; modify the CTE
(descendant_markers) to carry a visited set/path (e.g., an array or jsonb of
marker ids) and only recurse into rows whose id is not already in that visited
set, appending the new id to visited on each recursion; keep the same final
SELECT filtering out ids in swept_marker and preserve the base case (SELECT
marker.id) but initialize visited with that id so cycles like A->B->A are
prevented.
@arkanaai @coderabbitai review

✅ Actions performed: Review triggered.
Arkana Code Review — PR #908: Scale the DAG

Reviewed the full diff. This is a large and well-thought-out optimization. The marker DAG design is sound and the benchmark numbers are compelling. A few things need attention before merge.

🚨 Protocol-Critical — Requires Human Sign-Off

1. Sweep path change in
The checkpoint sweep now routes through
2. Replacing the direct

🐛 Bug: Down Migration Doesn't Restore
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
internal/infrastructure/db/postgres/sqlc/queries/query.sql.go (2)
1-15: ⚠️ Potential issue | 🟡 Minor
Reminder: generated file — propagate fixes via the
.sql source. Header marks this as sqlc-generated (
DO NOT EDIT). Any adjustment suggested on this file (e.g., missing casts, query shape) must be made in internal/infrastructure/db/postgres/sqlc/query.sql and regenerated with sqlc generate, otherwise the next regeneration will revert manual edits.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infrastructure/db/postgres/sqlc/queries/query.sql.go` around lines 1 - 15, This is a generated sqlc file (package queries) and must not be edited directly; make any changes to the SQL source file internal/infrastructure/db/postgres/sqlc/query.sql (adjust query shapes, casts, or types that produced the missing/incorrect Go code) and then run sqlc generate to regenerate query.sql.go so changes to functions/types in package queries are preserved; do not modify query.sql.go manually.
2720-2743: ⚠️ Potential issue | 🟠 Major
Add
::jsonb cast to @markers parameter in UpsertVtxo query for consistency with UpdateVtxoMarkers. The source
query.sql shows UpsertVtxo passes @markers (json.RawMessage) without a cast, while the identically-typed UpdateVtxoMarkers explicitly casts with @markers::jsonb. Since both statements assign to the same jsonb markers column and use json.RawMessage, the inconsistency creates fragility: INSERT/UPSERT operations may fail when lib/pq's parameter-type inference does not handle the uncast json.RawMessage as jsonb. Fix in
internal/infrastructure/db/postgres/sqlc/query.sql: cast the markers parameter to jsonb in the VALUES clause (similar to line 2349 in UpdateVtxoMarkers).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infrastructure/db/postgres/sqlc/queries/query.sql.go` around lines 2720 - 2743, The UpsertVtxo query is passing the markers parameter without an explicit type which can break when lib/pq doesn't infer jsonb; update the VALUES clause of the UpsertVtxo statement to cast the markers parameter to jsonb (i.e., use the same `@markers`::jsonb style used in UpdateVtxoMarkers) so the markers column (jsonb) is always supplied as jsonb and the ON CONFLICT/UPDATE assignment to markers continues to work reliably.
🧹 Nitpick comments (6)
internal/core/application/service.go (1)
1100-1125: Sort parentMarkerIDs before emitting the accepted event.

This slice is built from map iteration, so its order is nondeterministic. The same logical parent set can therefore serialize differently across runs, which is brittle once it is persisted in events/JSON.
♻️ Suggested change
 parentMarkerIDs := make([]string, 0, len(parentMarkerSet))
 for id := range parentMarkerSet {
   parentMarkerIDs = append(parentMarkerIDs, id)
 }
+ slices.Sort(parentMarkerIDs)
 change, err := offchainTx.Accept(
   fullySignedArkTx,
   signedCheckpointTxsMap,

🤖 Prompt for AI Agents
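The underlying rule — never serialize Go map iteration order — can be shown in isolation. This is a generic sketch, not the repo's code; sort.Strings is the pre-Go-1.21 spelling of the slices.Sort call the diff uses:

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys materializes a set's keys in a stable order. Go map iteration
// is deliberately randomized, so anything persisted or serialized from a
// map must be sorted first to come out identical across runs.
func sortedKeys(set map[string]struct{}) []string {
	keys := make([]string, 0, len(set))
	for k := range set {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	parentMarkerSet := map[string]struct{}{"m3": {}, "m1": {}, "m2": {}}
	fmt.Println(sortedKeys(parentMarkerSet)) // always [m1 m2 m3]
}
```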
Verify each finding against the current code and only fix it if needed. In `@internal/core/application/service.go` around lines 1100 - 1125, The parentMarkerIDs slice is populated from parentMarkerSet (map iteration) and thus has nondeterministic order; sort parentMarkerIDs (e.g., call sort.Strings(parentMarkerIDs)) after building it and before calling offchainTx.Accept so the emitted accepted event is deterministic, and add the required import for the sort package if missing; reference variables: parentMarkerSet, parentMarkerIDs and the call offchainTx.Accept.

internal/infrastructure/db/sqlite/marker_repo.go (2)
142-185: "Bulk" sweep methods are actually per-row inserts inside a transaction.

PR objectives explicitly call for "bulk INSERTs into swept_markers" to keep sweeping cheap at 20k+ depth. In this SQLite implementation both BulkSweepMarkers (142–162) and SweepVtxoOutpoints (164–185) loop one INSERT per element. For a window covering thousands of markers/outpoints this becomes thousands of round-trips even though it's a single transaction. SQLite supports multi-VALUES inserts — prefer a single statement per batch (chunked to SQLITE_MAX_VARIABLE_NUMBER / params-per-row):
- Add a sqlc query such as
BulkInsertSweptMarkers that takes sqlc.slice and performs INSERT ... VALUES (?, ?), (?, ?), ... ON CONFLICT DO NOTHING, or
Same guidance for
CreateRootMarkersForVtxos(lines 314–340).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infrastructure/db/sqlite/marker_repo.go` around lines 142 - 185, Both BulkSweepMarkers and SweepVtxoOutpoints currently perform per-row INSERTs inside a transaction; change them to perform batched multi-row INSERTs instead (chunked to SQLITE_MAX_VARIABLE_NUMBER / params-per-row) to avoid thousands of round-trips: either add sqlc queries like BulkInsertSweptMarkers that accept slices and emit INSERT ... VALUES (...),(...),... ON CONFLICT DO NOTHING, or dynamically build and exec a single multi-row statement per chunk in the markerRepository methods (BulkSweepMarkers and SweepVtxoOutpoints); apply the same batching approach to CreateRootMarkersForVtxos so all large sweeps use chunked multi-value INSERTs rather than per-row inserts.
380-411: GetVtxoChainByMarkers issues one full-scan LIKE query per marker.

Combined with the SQLite query that already scans vtxo_vw, calling this with N markers produces N full table scans plus an in-memory dedup. For preloadByMarkers this scales with chain depth (N/MarkerInterval), so the "bulk fetch" benefit narrows on large chains.

Options, in order of effort:
- Add a sqlc query that accepts
sqlc.slice('ids') and unions the per-marker sub-queries (still LIKE but one round-trip).
- Re-express using
EXISTS (SELECT 1 FROM json_each(markers) j WHERE j.value IN (…)) inside a CTE — single statement, more accurate matching, still scan-bound in SQLite but avoids the Go-side fanout.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infrastructure/db/sqlite/marker_repo.go` around lines 380 - 411, GetVtxoChainByMarkers currently issues one LIKE query per marker causing N full scans; replace the per-marker loop with a single SQL query and update the method to call it. Add a new sqlc query (e.g., SelectVtxoChainByMarkers(ids sqlc.slice('ids'))) that accepts a slice of marker IDs and returns distinct vtxo rows (use UNION of per-marker subqueries or a single query with json_each/EXISTS and SELECT DISTINCT txid, vout to dedupe). Then update markerRepository.GetVtxoChainByMarkers to call querier.SelectVtxoChainByMarkers once with the markerIDs slice, map the returned rows via rowToVtxoFromChainQuery, and remove the per-marker loop and seen map.internal/infrastructure/db/sqlite/sqlc/query.sql (1)
548-550: Parameter style inconsistency in InsertSweptVtxo.

Uses positional ? placeholders while every other statement in this file uses named @ parameters (e.g., InsertSweptMarker right above). sqlc will still generate it, but the positional form yields parameter names like Column1/Column2/Column3 in Go instead of Txid/Vout/SweptAt. Align with the rest of the file.

♻️ Proposed diff
 -- name: InsertSweptVtxo :exec
 INSERT OR IGNORE INTO swept_vtxo (txid, vout, swept_at)
-VALUES (?, ?, ?);
+VALUES (@txid, @vout, @swept_at);

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infrastructure/db/sqlite/sqlc/query.sql` around lines 548 - 550, Replace the positional parameter placeholders in the SQL statement for InsertSweptVtxo with named @ parameters to match the rest of the file; specifically change the VALUES (?, ?, ?) usage to use `@txid`, `@vout`, `@swept_at` (matching columns txid, vout, swept_at) so sqlc generates Go params named Txid/Vout/SweptAt instead of Column1/2/3.internal/infrastructure/db/postgres/sqlc/query.sql (1)
475-488: Cycle handling in the recursive CTE — functionally safe, but worth a comment.

Unlike SQLite where the past review comment was addressed via json_each, this PG version uses plain UNION (not UNION ALL), so Postgres' recursive-CTE semantics deduplicate the working set and a cycle like A → B → A self-terminates. That's subtle; a one-line SQL comment stating "UNION (set semantics) handles cycles by filtering rows already produced" would prevent a future refactor to UNION ALL from silently introducing an infinite loop.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infrastructure/db/postgres/sqlc/query.sql` around lines 475 - 488, The recursive CTE GetDescendantMarkerIds relies on UNION's set semantics to deduplicate rows and thus prevent infinite loops on cycles; add a concise SQL comment above the WITH RECURSIVE descendant_markers(...) noting that "UNION (set semantics) handles cycles by filtering rows already produced, do not change to UNION ALL" so future maintainers won't convert to UNION ALL and accidentally reintroduce infinite recursion; reference descendant_markers and the use of UNION in the comment.internal/infrastructure/db/sqlite/sqlc/queries/query.sql.go (1)
100-113: NOT IN against swept_marker is fine today but fragile; prefer NOT EXISTS / anti-join.

GetDescendantMarkerIds terminates with WHERE descendant_markers.id NOT IN (SELECT sm.marker_id FROM swept_marker sm). Correctness today relies on swept_marker.marker_id being NOT NULL (PK) — if that ever changes or a view is substituted, a single NULL in the subquery collapses the result to empty silently. NOT EXISTS is also typically better for SQLite's planner on larger swept_marker tables. Same applies to the Postgres twin (Line 147 in postgres/sqlc/queries/query.sql.go).

Suggested query shape (apply in both .sql sources, regenerate)

-SELECT descendant_markers.id AS marker_id FROM descendant_markers
-WHERE descendant_markers.id NOT IN (SELECT sm.marker_id FROM swept_marker sm)
+SELECT dm.id AS marker_id
+FROM descendant_markers dm
+WHERE NOT EXISTS (
+  SELECT 1 FROM swept_marker sm WHERE sm.marker_id = dm.id
+)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/infrastructure/db/sqlite/sqlc/queries/query.sql.go` around lines 100 - 113, The recursive query GetDescendantMarkerIds uses "WHERE descendant_markers.id NOT IN (SELECT sm.marker_id FROM swept_marker sm)" which is fragile if sm.marker_id can ever be NULL; change this to an anti-join using NOT EXISTS (e.g. WHERE NOT EXISTS (SELECT 1 FROM swept_marker sm WHERE sm.marker_id = descendant_markers.id)) or a LEFT JOIN ... WHERE sm.marker_id IS NULL to avoid NULL-subquery issues; update the getDescendantMarkerIds SQL in query.sql.go (referencing descendant_markers, swept_marker, marker_id) and make the corresponding change in the Postgres twin so you can regenerate the sqlc artifacts.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/core/application/indexer.go`:
- Around line 611-620: The preload step currently aborts on any marker-related
error; change preloadByMarkers usage so transient marker repo errors (from
MarkerRepo.GetVtxoChainByMarkers or MarkerRepo.GetMarkersByIds surfaced inside
preloadByMarkers) are logged and ignored rather than returned: inside
preloadByMarkers (or at the call site in indexer.go where preloadByMarkers is
invoked), catch errors from marker lookups, call the existing logger to record
the error and context, and continue without returning an error so the caller can
proceed relying on per-hop Vtxos().GetVtxos + ensureVtxosCached fallback; keep
real failures from Vtxos().GetVtxos propagated as before but ensure
marker-retrieval failures do not hard-abort GetVtxoChain.
- Around line 612-623: The code is redundantly calling
repoManager.Vtxos().GetVtxos(frontier) twice: once before the loop and again
inside ensureVtxosCached; to fix, have the initial GetVtxos result be used to
seed the vtxoCache/loadedMarkers so ensureVtxosCached can skip re-fetching the
same outpoints — specifically, when Markers() != nil capture startVtxos returned
by Vtxos().GetVtxos and pass those into preloadByMarkers and also insert them
into vtxoCache (and mark them in loadedMarkers) so that the subsequent call path
in ensureVtxosCached sees them as cached and does not call Vtxos().GetVtxos for
the same nextVtxos again.
In `@internal/core/application/service.go`:
- Around line 372-378: The current code in the block checking len(spentVtxos) !=
len(spentVtxoKeys) returns early, which silently drops propagation for a
finalized offchain tx — instead, on a short/partial parent read either
retry/fail this path (propagate an error so the caller can requeue/retry) or
fall back to the accepted event metadata and continue emitting the
TransactionEvent and starting the watcher; specifically, replace the early
return in the block that references spentVtxos, spentVtxoKeys and txid with
logic that (a) attempts a retry/read-backoff (or returns an explicit error) and
if retries are exhausted, (b) constructs/uses the metadata present on the
accepted event to emit the TransactionEvent and invoke the watcher startup path
so outputs remain tracked rather than dropped.
In `@internal/core/application/sweeper.go`:
- Around line 767-778: GetAllChildrenVtxos is being called with only vtxo.Txid
which returns descendants of the whole transaction and can include siblings;
change the lookup to be outpoint-scoped and pass only descendants for that
output. Replace the call to s.repoManager.Vtxos().GetAllChildrenVtxos(ctx,
vtxo.Txid) with an outpoint-aware query (e.g.,
s.repoManager.Vtxos().GetAllChildrenVtxosByOutpoint(ctx, vtxo.Txid, vtxo.Vout)
or extend GetAllChildrenVtxos to accept an outpoint struct), ensure the
resulting childrenVtxos only include descendants of that txid:vout, then call
s.repoManager.Markers().SweepVtxoOutpoints(ctx, childrenVtxos, sweptAt) as
before; update the Vtxos repo implementation accordingly to perform
txid+vout-scoped traversal.
In
`@internal/infrastructure/db/postgres/migration/20260416120000_add_swept_vtxo.down.sql`:
- Around line 1-54: The down migration currently drops swept_vtxo and recreates
vtxo_vw/intent_with_inputs_vw without restoring checkpoint-sweep state, silently
resurrecting swept VTXOs; instead either implement a translation that recreates
swept_vtxo from swept_marker/markers before dropping it or make the migration
explicitly irreversible. Modify 20260416120000_add_swept_vtxo.down.sql: remove
the DROP TABLE IF EXISTS swept_vtxo + view recreation path and replace with a
RAISE EXCEPTION 'irreversible migration: cannot restore swept_vtxo state' (or,
if you prefer a restore, create swept_vtxo and populate it by joining
swept_marker to vtxo via markers before recreating vtxo_vw and
intent_with_inputs_vw) so vtxo_vw, intent_with_inputs_vw and swept_vtxo
consistency is preserved.
In `@internal/infrastructure/db/sqlite/marker_repo.go`:
- Around line 284-312: Wrap the three DB calls in SweepVtxosByMarker (GetMarker,
CountUnsweptVtxosByMarkerId, InsertSweptMarker) inside a single transaction
using the repository's execTx helper: start a transaction, run GetMarker, then
CountUnsweptVtxosByMarkerId, then InsertSweptMarker within that tx context,
commit on success and rollback on error, and return the counted value; this
ensures the returned count matches the sweep insert atomically for
metrics/accounting while keeping the existing idempotent insert behavior.
- Around line 540-549: parseMarkersJSON currently swallows json.Unmarshal errors
and returns nil, making corrupt markers indistinguishable from empty ones;
change parseMarkersJSON(markersJSON string) to return ([]string, error) instead
of silently returning nil, have it return (nil, err) when json.Unmarshal fails
(or (nil, nil) when markersJSON==""), and update all callers of parseMarkersJSON
to handle the error (log at warn/error or propagate) so corrupt markers are
surfaced rather than treated as empty.
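The suggested signature change can be sketched as follows; `parseMarkersJSON` here is reimplemented from the comment's description, not copied from the repository:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseMarkersJSON returns the decoded marker IDs, or an error when the
// stored JSON is corrupt, so callers can distinguish "empty" from "broken".
// This is a sketch of the suggested fix, not the repository's actual code.
func parseMarkersJSON(markersJSON string) ([]string, error) {
	if markersJSON == "" {
		return nil, nil // genuinely empty: no markers recorded
	}
	var markers []string
	if err := json.Unmarshal([]byte(markersJSON), &markers); err != nil {
		return nil, fmt.Errorf("corrupt markers column: %w", err)
	}
	return markers, nil
}

func main() {
	markers, err := parseMarkersJSON(`["m0","m1"]`)
	fmt.Println(markers, err)
	_, err = parseMarkersJSON(`{not json`)
	fmt.Println(err != nil)
}
```

Callers can then log or propagate the error instead of silently treating a corrupt row as markerless.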
In `@internal/infrastructure/db/sqlite/sqlc/queries/query.sql.go`:
- Around line 93-98: The marker-ID nullability is inconsistent across SQLite
queries (CountUnsweptVtxosByMarkerId, SelectVtxosByMarkerId vs
SelectVtxoChainByMarker); make them consistent by ensuring the sql.NullString
parameter's Valid flag is set deterministically before use—update
CountUnsweptVtxosByMarkerId and SelectVtxosByMarkerId to set markerID.Valid =
len(markerID.String) > 0 (or markerID.Valid = markerID.String != "") before
calling q.db.QueryRowContext / q.db.QueryContext so all three functions treat
empty IDs the same, or alternatively change all three signatures to accept plain
string and adjust the SQL (CAST/COALESCE) so sqlc emits non-nullable strings;
pick one approach and apply it to the named functions to restore consistency.
---
Outside diff comments:
In `@internal/infrastructure/db/postgres/sqlc/queries/query.sql.go`:
- Around line 1-15: This is a generated sqlc file (package queries) and must not
be edited directly; make any changes to the SQL source file
internal/infrastructure/db/postgres/sqlc/query.sql (adjust query shapes, casts,
or types that produced the missing/incorrect Go code) and then run sqlc generate
to regenerate query.sql.go so changes to functions/types in package queries are
preserved; do not modify query.sql.go manually.
- Around line 2720-2743: The UpsertVtxo query is passing the markers parameter
without an explicit type which can break when lib/pq doesn't infer jsonb; update
the VALUES clause of the UpsertVtxo statement to cast the markers parameter to
jsonb (i.e., use the same `@markers`::jsonb style used in UpdateVtxoMarkers) so
the markers column (jsonb) is always supplied as jsonb and the ON
CONFLICT/UPDATE assignment to markers continues to work reliably.
---
Nitpick comments:
In `@internal/core/application/service.go`:
- Around line 1100-1125: The parentMarkerIDs slice is populated from
parentMarkerSet (map iteration) and thus has nondeterministic order; sort
parentMarkerIDs (e.g., call sort.Strings(parentMarkerIDs)) after building it and
before calling offchainTx.Accept so the emitted accepted event is deterministic,
and add the required import for the sort package if missing; reference
variables: parentMarkerSet, parentMarkerIDs and the call offchainTx.Accept.
In `@internal/infrastructure/db/postgres/sqlc/query.sql`:
- Around line 475-488: The recursive CTE GetDescendantMarkerIds relies on
UNION's set semantics to deduplicate rows and thus prevent infinite loops on
cycles; add a concise SQL comment above the WITH RECURSIVE
descendant_markers(...) noting that "UNION (set semantics) handles cycles by
filtering rows already produced, do not change to UNION ALL" so future
maintainers won't convert to UNION ALL and accidentally reintroduce infinite
recursion; reference descendant_markers and the use of UNION in the comment.
In `@internal/infrastructure/db/sqlite/marker_repo.go`:
- Around line 142-185: Both BulkSweepMarkers and SweepVtxoOutpoints currently
perform per-row INSERTs inside a transaction; change them to perform batched
multi-row INSERTs instead (chunked to SQLITE_MAX_VARIABLE_NUMBER /
params-per-row) to avoid thousands of round-trips: either add sqlc queries like
BulkInsertSweptMarkers that accept slices and emit INSERT ... VALUES
(...),(...),... ON CONFLICT DO NOTHING, or dynamically build and exec a single
multi-row statement per chunk in the markerRepository methods (BulkSweepMarkers
and SweepVtxoOutpoints); apply the same batching approach to
CreateRootMarkersForVtxos so all large sweeps use chunked multi-value INSERTs
rather than per-row inserts.
- Around line 380-411: GetVtxoChainByMarkers currently issues one LIKE query per
marker causing N full scans; replace the per-marker loop with a single SQL query
and update the method to call it. Add a new sqlc query (e.g.,
SelectVtxoChainByMarkers(ids sqlc.slice('ids'))) that accepts a slice of marker
IDs and returns distinct vtxo rows (use UNION of per-marker subqueries or a
single query with json_each/EXISTS and SELECT DISTINCT txid, vout to dedupe).
Then update markerRepository.GetVtxoChainByMarkers to call
querier.SelectVtxoChainByMarkers once with the markerIDs slice, map the returned
rows via rowToVtxoFromChainQuery, and remove the per-marker loop and seen map.
In `@internal/infrastructure/db/sqlite/sqlc/queries/query.sql.go`:
- Around line 100-113: The recursive query GetDescendantMarkerIds uses "WHERE
descendant_markers.id NOT IN (SELECT sm.marker_id FROM swept_marker sm)" which
is fragile if sm.marker_id can ever be NULL; change this to an anti-join using
NOT EXISTS (e.g. WHERE NOT EXISTS (SELECT 1 FROM swept_marker sm WHERE
sm.marker_id = descendant_markers.id)) or a LEFT JOIN ... WHERE sm.marker_id IS
NULL to avoid NULL-subquery issues; update the getDescendantMarkerIds SQL in
query.sql.go (referencing descendant_markers, swept_marker, marker_id) and make
the corresponding change in the Postgres twin so you can regenerate the sqlc
artifacts.
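A hedged sketch of the anti-join rewrite; the CTE shape is inferred from the names in this comment (`descendant_markers`, `swept_marker`, `marker_id`) and may differ from the actual query:

```sql
-- Anchor and recursive step elided; only the final filter changes.
WITH RECURSIVE descendant_markers(id) AS (
    ...
)
SELECT dm.id
FROM descendant_markers dm
-- NOT EXISTS is NULL-safe, unlike NOT IN over a possibly-NULL column:
WHERE NOT EXISTS (
    SELECT 1 FROM swept_marker sm WHERE sm.marker_id = dm.id
);
```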
In `@internal/infrastructure/db/sqlite/sqlc/query.sql`:
- Around line 548-550: Replace the positional parameter placeholders in the SQL
statement for InsertSweptVtxo with named @ parameters to match the rest of the
file; specifically change the VALUES (?, ?, ?) usage to use `@txid`, `@vout`,
`@swept_at` (matching columns txid, vout, swept_at) so sqlc generates Go params
named Txid/Vout/SweptAt instead of Column1/2/3.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8faa340e-f86d-44fa-80ef-cb16a7d8d6d1
📒 Files selected for processing (25)
- internal/core/application/indexer.go
- internal/core/application/indexer_bench_test.go
- internal/core/application/indexer_test.go
- internal/core/application/service.go
- internal/core/application/sweeper.go
- internal/core/application/sweeper_test.go
- internal/core/domain/marker_repo.go
- internal/infrastructure/db/badger/marker_repo.go
- internal/infrastructure/db/postgres/marker_repo.go
- internal/infrastructure/db/postgres/migration/20260416120000_add_swept_vtxo.down.sql
- internal/infrastructure/db/postgres/migration/20260416120000_add_swept_vtxo.up.sql
- internal/infrastructure/db/postgres/sqlc/queries/models.go
- internal/infrastructure/db/postgres/sqlc/queries/query.sql.go
- internal/infrastructure/db/postgres/sqlc/query.sql
- internal/infrastructure/db/postgres/vtxo_repo.go
- internal/infrastructure/db/service.go
- internal/infrastructure/db/service_test.go
- internal/infrastructure/db/sqlite/marker_repo.go
- internal/infrastructure/db/sqlite/migration/20260416120000_add_swept_vtxo.down.sql
- internal/infrastructure/db/sqlite/migration/20260416120000_add_swept_vtxo.up.sql
- internal/infrastructure/db/sqlite/round_repo.go
- internal/infrastructure/db/sqlite/sqlc/queries/models.go
- internal/infrastructure/db/sqlite/sqlc/queries/query.sql.go
- internal/infrastructure/db/sqlite/sqlc/query.sql
- internal/infrastructure/db/sqlite/vtxo_repo.go
✅ Files skipped from review due to trivial changes (1)
- internal/core/domain/marker_repo.go
🚧 Files skipped from review as they are similar to previous changes (5)
- internal/infrastructure/db/postgres/marker_repo.go
- internal/infrastructure/db/postgres/sqlc/queries/models.go
- internal/core/application/indexer_bench_test.go
- internal/infrastructure/db/service.go
- internal/infrastructure/db/badger/marker_repo.go
```go
if _, ok := outpoints[outpoint.String()]; !ok {
	return nil, fmt.Errorf("auth token is not for outpoint %s", outpoint)
}

// Determine page size.
// Backward compat: nil page + empty token → return full chain (no pagination).
pageSize := math.MaxInt32
if page != nil {
	pageSize = int(page.PageSize)
	if pageSize <= 0 {
		pageSize = maxPageSizeVtxoChain
	}
} else if pageToken != "" {
	pageSize = maxPageSizeVtxoChain
}

// Determine frontier: decode pageToken, or use [vtxoKey] for first page.
var frontier []domain.Outpoint
if pageToken != "" {
	decoded, err := i.decodeChainCursor(pageToken)
	if err != nil {
		return nil, fmt.Errorf("invalid page_token: %w", err)
	}
	frontier = decoded
} else {
	frontier = []domain.Outpoint{vtxoKey}
}
```
Backward-compat fallback leaves GetVtxoChain memory-unbounded.
When both page == nil and pageToken == "", pageSize becomes math.MaxInt32, so walkVtxoChain collects the full chain into chain/allOutpoints/visited/vtxoCache. For the 20k+ depths this PR targets, a public-exposure call without pagination can still OOM the server. The prior reviewer's thread already flagged the scalability concern for GetVtxoChainByIntent; the same hazard applies here for legacy clients. Consider defaulting to maxPageSizeVtxoChain and emitting a NextPageToken, then documenting the change, rather than preserving unbounded traversal.
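For reference, the cursor mechanics under discussion can be sketched as follows; the field names and exact encoding are assumptions, since the PR only states that tokens are base64-encoded frontier outpoints:

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// Outpoint mirrors the shape of domain.Outpoint assumed here for
// illustration; field names are not taken from the actual codebase.
type Outpoint struct {
	Txid string `json:"txid"`
	Vout uint32 `json:"vout"`
}

// encodeChainCursor serializes the remaining BFS frontier as an opaque token.
func encodeChainCursor(frontier []Outpoint) (string, error) {
	if len(frontier) == 0 {
		return "", nil // no more pages
	}
	buf, err := json.Marshal(frontier)
	if err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(buf), nil
}

// decodeChainCursor restores the frontier from a client-supplied token.
func decodeChainCursor(token string) ([]Outpoint, error) {
	buf, err := base64.RawURLEncoding.DecodeString(token)
	if err != nil {
		return nil, fmt.Errorf("invalid page_token: %w", err)
	}
	var frontier []Outpoint
	if err := json.Unmarshal(buf, &frontier); err != nil {
		return nil, fmt.Errorf("invalid page_token: %w", err)
	}
	return frontier, nil
}

func main() {
	tok, _ := encodeChainCursor([]Outpoint{{Txid: "abc1", Vout: 0}})
	frontier, err := decodeChainCursor(tok)
	fmt.Println(len(frontier), err)
}
```

Defaulting legacy callers to a bounded page size and returning a token like this would cap per-call memory without breaking the API shape.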
```go
if i.repoManager.Markers() != nil {
	startVtxos, err := i.repoManager.Vtxos().GetVtxos(ctx, nextVtxos)
	if err != nil {
		return nil, nil, "", err
	}
	if err := i.preloadByMarkers(ctx, startVtxos, vtxoCache, offchainTxCache); err != nil {
		return nil, nil, "", err
	}
}

for len(nextVtxos) > 0 {
	if err := i.ensureVtxosCached(ctx, nextVtxos, vtxoCache, loadedMarkers); err != nil {
```
Redundant GetVtxos for the frontier on the miss path.
When Markers() != nil but the frontier VTXO isn't persisted (e.g., bad client input), line 613 issues GetVtxos(frontier) which returns empty, preloadByMarkers is a no-op, and then ensureVtxosCached at line 623 issues GetVtxos(frontier) a second time for the same outpoints before ultimately erroring with "vtxo not found". Minor, but easy to avoid by seeding the cache with whatever the first call returned even on the empty result path — or by reusing the pre-fetch results in the main loop.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/core/application/indexer.go` around lines 612 - 623, The code is
redundantly calling repoManager.Vtxos().GetVtxos(frontier) twice: once before
the loop and again inside ensureVtxosCached; to fix, have the initial GetVtxos
result be used to seed the vtxoCache/loadedMarkers so ensureVtxosCached can skip
re-fetching the same outpoints — specifically, when Markers() != nil capture
startVtxos returned by Vtxos().GetVtxos and pass those into preloadByMarkers and
also insert them into vtxoCache (and mark them in loadedMarkers) so that the
subsequent call path in ensureVtxosCached sees them as cached and does not call
Vtxos().GetVtxos for the same nextVtxos again.
```go
if len(spentVtxos) != len(spentVtxoKeys) {
	log.Warnf(
		"incomplete parent read: got %d of %d spent vtxos for tx %s",
		len(spentVtxos), len(spentVtxoKeys), txid,
	)
	return
}
```
Don’t silently drop finalized offchain tx propagation on a short parent read.
Returning here means no TransactionEvent is emitted and no watcher is started for that finalized tx. A partial read now leaves those outputs untracked until some later recovery path happens to notice them. Please fail/retry this path, or fall back to metadata already carried by the accepted event instead of returning.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/core/application/service.go` around lines 372 - 378, The current
code in the block checking len(spentVtxos) != len(spentVtxoKeys) returns early,
which silently drops propagation for a finalized offchain tx — instead, on a
short/partial parent read either retry/fail this path (propagate an error so the
caller can requeue/retry) or fall back to the accepted event metadata and
continue emitting the TransactionEvent and starting the watcher; specifically,
replace the early return in the block that references spentVtxos, spentVtxoKeys
and txid with logic that (a) attempts a retry/read-backoff (or returns an
explicit error) and if retries are exhausted, (b) constructs/uses the metadata
present on the accepted event to emit the TransactionEvent and invoke the
watcher startup path so outputs remain tracked rather than dropped.
Closes #833
Summary
Introduces a marker DAG — sparse checkpoints placed every 100 depths in the VTXO chain — that enables bulk DB operations instead of per-VTXO queries. Combined with bulk offchain tx fetching (merged from #1005),
`GetVtxoChain` on a 10k-deep linear chain drops from thousands of sequential DB round-trips to ~200 bulk calls.

What changed
- `marker` and `swept_marker` tables, `depth` and `markers` JSONB columns on `vtxo`, with GIN indexes. Markers are created at every `MarkerInterval = 100` depths. Each VTXO carries the IDs of markers covering it.
- `preloadByMarkers()` walks the marker DAG upward from the frontier, bulk-loading all VTXO records into `vtxoCache` in ~N/100 calls instead of N individual fetches.
- `GetOffchainTxsByTxids` bulk-fetches offchain txs per marker window into `offchainTxCache`, so the BFS loop never hits the DB for individual offchain tx lookups.
- Remaining call sites fetch offchain txs in bulk via `GetOffchainTxsByTxids` instead of one-by-one.
- The `swept` boolean column is removed from `vtxo`; swept status is now computed dynamically via `EXISTS (SELECT 1 FROM swept_marker sm WHERE v.markers @> jsonb_build_array(sm.marker_id))`.
- `BulkSweepMarkers` sweeps all VTXOs under a marker in one operation.
- `GetVtxoChain` gains cursored pagination — `pageToken`/`nextPageToken` using base64-encoded frontier outpoints.
- New `depth` field on VTXOs — exposed via protobuf for clients to track chain position.
- New index `idx_checkpoint_tx_offchain_txid` on `checkpoint_tx(offchain_txid)` to accelerate the bulk offchain tx query's join.

Notes
- Peak memory is not bounded by `pageSize` — it scales with full chain length. A process-level LRU cache keyed by marker ID could collapse duplicate work across pagination calls but is out of scope.
- `idx_checkpoint_tx_offchain_txid` adds a small write-path cost per checkpoint tx insert.

Visual explainer
Background: the VTXO tree and `walkVtxoChain`

A VTXO chain is a tree of transactions. `walkVtxoChain` does a BFS — it processes all VTXOs at one depth, discovers their children via checkpoint tx inputs, then moves to the next depth.

Each VTXO is either on-chain (leaf of a batch tree) or preconfirmed (created by an offchain Ark tx). Preconfirmed VTXOs have a corresponding OffchainTx record containing checkpoint PSBTs whose inputs point to parent VTXOs — that's how the traversal discovers the next level.
```mermaid
graph TD
  subgraph "VTXO tree walkVtxoChain traverses"
    R["Round Commitment Tx (on-chain root)"]
    R --> A["VTXO A (depth 0)"]
    R --> B["VTXO B (depth 0)"]
    A --> C["VTXO C (preconfirmed, depth 1)"]
    A --> D["VTXO D (preconfirmed, depth 1)"]
    B --> E["VTXO E (preconfirmed, depth 1)"]
    C --> F["VTXO F (preconfirmed, depth 2)"]
    C --> G["VTXO G (preconfirmed, depth 2)"]
    D --> H["VTXO H (preconfirmed, depth 2)"]
    E --> I["VTXO I (preconfirmed, depth 2)"]
    E --> J["VTXO J (preconfirmed, depth 2)"]
  end
```

Marker DAG: compressed traversal structure
Instead of visiting every VTXO individually, the marker DAG creates checkpoints every 100 depths.
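The checkpoint interval just described can be sketched with a small helper; only `MarkerInterval = 100` comes from the PR, while the windowing rule is an illustrative assumption:

```go
package main

import "fmt"

// MarkerInterval matches the constant described in the PR: one marker
// checkpoint covers every 100 depths of the VTXO chain.
const MarkerInterval = 100

// markerWindow returns the depth range covered by the marker that a VTXO
// at the given depth belongs to. The windowing rule is inferred from the
// PR description, not taken from the actual implementation.
func markerWindow(depth int) (lo, hi int) {
	lo = (depth / MarkerInterval) * MarkerInterval
	return lo, lo + MarkerInterval - 1
}

func main() {
	lo, hi := markerWindow(250)
	fmt.Println(lo, hi) // 200 299
}
```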
`preloadByMarkers` walks this small DAG to bulk-load all VTXOs and their offchain txs before the BFS loop starts.

```mermaid
graph TD
  subgraph "Marker DAG (small, ~N/100 nodes)"
    M0["Marker M0 (depths 0-99)"]
    M1["Marker M1 (depths 100-199)"]
    M2["Marker M2 (depths 200-299)"]
    M0 --> M1 --> M2
  end
  subgraph "VTXO tree (large, N nodes)"
    V0["~100 VTXOs (depths 0-99)"]
    V1["~100 VTXOs (depths 100-199)"]
    V2["~100 VTXOs (depths 200-299)"]
  end
  M0 -.->|covers| V0
  M1 -.->|covers| V1
  M2 -.->|covers| V2
```

How preloadByMarkers works
The marker DAG walk bulk-fetches both VTXO records and offchain txs per marker window, so the BFS loop runs entirely from cache:
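In code, the walk can be sketched as follows; the callback shape is an illustration, since the real implementation works against repository interfaces:

```go
package main

import "fmt"

// preloadWindows sketches the marker-DAG walk described above: start from
// the markers covering the frontier, bulk-load each window once, then
// follow parent links upward. The visited set also guarantees termination
// on cyclic parent chains.
func preloadWindows(start []string, parents map[string][]string, load func(string)) {
	visited := make(map[string]struct{})
	queue := append([]string(nil), start...)
	for len(queue) > 0 {
		id := queue[0]
		queue = queue[1:]
		if _, ok := visited[id]; ok {
			continue // already loaded, or part of a cycle
		}
		visited[id] = struct{}{}
		load(id) // one bulk fetch (~100 VTXOs + offchain txs) per window
		queue = append(queue, parents[id]...)
	}
}

func main() {
	// m2 -> m1 -> m0, plus a deliberate cycle m0 -> m2.
	parents := map[string][]string{"m2": {"m1"}, "m1": {"m0"}, "m0": {"m2"}}
	loads := 0
	preloadWindows([]string{"m2"}, parents, func(string) { loads++ })
	fmt.Println(loads) // 3: each window loaded exactly once despite the cycle
}
```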
```mermaid
sequenceDiagram
  participant W as walkVtxoChain
  participant MDAG as Marker DAG walk
  participant DB as Database
  Note over W,MDAG: Before BFS loop starts
  W->>MDAG: preloadByMarkers(startVtxos)
  loop For each marker window (N/100 iterations)
    MDAG->>DB: GetVtxoChainByMarkers(window)
    DB-->>MDAG: ~100 VTXO records
    MDAG->>DB: GetOffchainTxsByTxids(preconfirmed txids)
    DB-->>MDAG: ~100 OffchainTx records
    MDAG->>DB: GetMarkersByIds(window)
    DB-->>MDAG: parent marker IDs
  end
  MDAG-->>W: vtxoCache + offchainTxCache fully populated
  Note over W,DB: BFS loop runs entirely from cache
  loop N iterations
    W->>W: vtxoCache lookup (hit)
    W->>W: offchainTxCache lookup (hit)
  end
```

In-loop bulk fetch fallback (for VTXOs without markers)
For wide fanout trees or VTXOs without marker coverage, the BFS loop itself collects missing offchain txids per iteration and batch-fetches them:
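A minimal sketch of that per-iteration batching, with `bulkFetch` standing in for the `GetOffchainTxsByTxids` repository call:

```go
package main

import "fmt"

// fetchOffchainBatch sketches the fallback described above: instead of one
// query per preconfirmed VTXO, collect the txids missing from the cache
// and fetch them in a single bulk call. Returns the number of DB
// round-trips issued (0 or 1).
func fetchOffchainBatch(
	txids []string,
	cache map[string]string,
	bulkFetch func([]string) map[string]string,
) int {
	var missing []string
	for _, txid := range txids {
		if _, ok := cache[txid]; !ok {
			missing = append(missing, txid)
		}
	}
	if len(missing) == 0 {
		return 0 // preload already covered this depth level
	}
	for txid, tx := range bulkFetch(missing) { // one round-trip, not len(missing)
		cache[txid] = tx
	}
	return 1
}

func main() {
	cache := map[string]string{"abc1": "tx1"}
	calls := fetchOffchainBatch([]string{"abc1", "abc2", "abc3"}, cache,
		func(ids []string) map[string]string {
			out := map[string]string{}
			for _, id := range ids {
				out[id] = "tx-" + id
			}
			return out
		})
	fmt.Println(calls, len(cache)) // 1 3
}
```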
```mermaid
graph TD
  subgraph "BFS iteration: 3 preconfirmed VTXOs at depth 1"
    C["VTXO C - txid: abc1"]
    D["VTXO D - txid: abc2"]
    E["VTXO E - txid: abc3"]
  end
  subgraph "Old: 3 individual queries"
    Q1["SELECT ... WHERE txid = 'abc1'"]
    Q2["SELECT ... WHERE txid = 'abc2'"]
    Q3["SELECT ... WHERE txid = 'abc3'"]
  end
  subgraph "New: 1 bulk query"
    QB["SELECT ... WHERE txid IN ('abc1','abc2','abc3')"]
  end
  C --> Q1
  D --> Q2
  E --> Q3
  C --> QB
  D --> QB
  E --> QB
```

Combined effect (10k-entry linear chain)
```mermaid
graph TD
  subgraph "Before (no markers, no bulk fetch)"
    BA["VTXO fetches: ~10000 individual queries"]
    BB["OffchainTx fetches: ~10000 individual queries"]
    BC["Total: ~20000 round-trips"]
    BA --> BC
    BB --> BC
  end
  subgraph "After markers only (no offchain tx preload)"
    MA["VTXO fetches: ~100 bulk queries (markers)"]
    MB["OffchainTx fetches: ~10000 individual queries"]
    MC["Total: ~10100 round-trips"]
    MA --> MC
    MB --> MC
  end
  subgraph "After markers + offchain tx preload"
    FA["VTXO fetches: ~100 bulk queries (markers)"]
    FB["OffchainTx fetches: ~100 bulk queries (markers)"]
    FC["Total: ~200 round-trips"]
    FA --> FC
    FB --> FC
  end
```

Benchmarks
Per-phase timing breakdown (10k-entry linear chain, 50µs simulated latency)
`TestVtxoChainTimingBreakdown` — in-process test with timing-decorated repos. All three repo phases (markers, VTXOs, offchain txs) make exactly 100 calls each — one per marker window. The remaining 145ms is PSBT deserialization and in-process overhead.
Call-counting test: 511-VTXO fanout tree
`TestBulkOffchainTxReducesDBCalls` — exercises the in-loop bulk fetch in isolation: ~57x reduction in DB round-trips. The 9 bulk calls correspond to 9 BFS depth levels.
Latency benchmark: 511-VTXO fanout tree
`BenchmarkOffchainTxBulkVsSingle` — wall-clock comparison with 50µs simulated DB latency.
TestVtxoChainTimingBreakdown— per-phase wall-clock breakdown for a 10k linear chain with timing-decorated repos. No prod instrumentation needed.TestBulkOffchainTxReducesDBCalls— asserts bulk fetch eliminates individualGetOffchainTxcalls on a fanout tree.BenchmarkOffchainTxBulkVsSingle— wall-clock comparison with simulated latency.BenchmarkGetVtxoChain— linear chains (1k-5k), fanout trees, and diamond patterns with/without markers.DB schema changes
New tables
marker(id TEXT PK, depth INTEGER, parent_markers JSONB)— DAG checkpoints every 100 depthsswept_marker(marker_id TEXT PK FK→marker, swept_at BIGINT)— append-only sweep recordsModified tables
vtxo— addeddepth INTEGER NOT NULL DEFAULT 0,markers JSONB NOT NULL DEFAULT '[]'(GIN indexed); removedsweptcolumnNew indexes
idx_vtxo_markers— GIN index onvtxo.markersidx_marker_depth— B-tree onmarker.depthidx_marker_parent_markers— GIN onmarker.parent_markersidx_checkpoint_tx_offchain_txid— B-tree oncheckpoint_tx.offchain_txidView changes
vtxo_vwrebuilt to computesweptdynamically:EXISTS (SELECT 1 FROM swept_marker sm WHERE v.markers @> jsonb_build_array(sm.marker_id))Migration backfill
markerscolumnswept_markertablesweptcolumn fromvtxoAdditional Notes:
The
docker-compose.regtest.ymlwas updated such that thenbxplorerimage tag was bumped up.Summary by CodeRabbit
New Features
Improvements
Database