Conversation
…ByMarkers, getVtxosFromCacheOrDB
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings. Use the following commands to manage reviews:
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@internal/core/application/indexer.go`:
- Around line 408-413: The loop that walks parent markers (using
marker.ParentMarkerIDs and i.repoManager.Markers().GetMarker) can infinite-loop
on cyclic parent chains; add a visited set (map[string]struct{}) keyed by marker
ID and check it before appending IDs or calling GetMarker to break cycles, and
also deduplicate ParentMarkerIDs when appending to markerIDs so you don't re-add
the same ID; update the loop to mark the current marker ID as visited, skip any
parent IDs already visited, and stop traversal if the next parent is seen.
In `@internal/infrastructure/db/badger/marker_repo.go`:
- Around line 501-514: GetVtxoChainByMarkers currently does a full table scan
via r.vtxoStore.Find(&dtos, &badgerhold.Query{}) and filters in-memory; change
it to query by marker IDs to avoid loading all vtxos: iterate markerIDs (or
batch them) and call r.vtxoStore.Find with badgerhold.Where("MarkerID").Eq(id)
for each id (or badgerhold.Where("MarkerID").In(batch) if supported), collect
matched vtxoDTOs, convert dto.Vtxo into the vtxos slice and return; ensure you
still respect markerIDSet and handle errors per-query.
In
`@internal/infrastructure/db/postgres/migration/20260210100000_add_vtxo_depth.up.sql`:
- Around line 8-13: The Postgres view vtxo_vw currently returns NULL for the
commitments column when no rows exist because it uses
string_agg(vc.commitment_txid, ','); change the SELECT to wrap string_agg with
COALESCE (e.g., COALESCE(string_agg(...), '')) so commitments always yields an
empty string like the SQLite view; update the SELECT that references vtxo,
vtxo_commitment_txid and the commitments alias to use
COALESCE(string_agg(vc.commitment_txid, ','), '').
In `@internal/infrastructure/db/service.go`:
- Around line 752-821: sweepVtxosWithMarkers currently marks a marker swept
before guaranteeing the marker's VTXOs were successfully removed, risking
inconsistent state; change the ordering so you attempt to sweep the marker's
VTXOs first (use markerStore.SweepVtxosByMarker and fall back to
vtxoStore.SweepVtxos for markerVtxos[markerID]) and only if that sweep returns
success call markerStore.SweepMarker(markerID, sweptAt); on any sweep error keep
the marker unmarked, log the failure, and accumulate the fallback count as now
done — update the loop in sweepVtxosWithMarkers to perform
SweepVtxosByMarker/SweepVtxos before calling SweepMarker and adjust error
handling accordingly.
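The cycle-safe traversal prescribed for `indexer.go` above can be sketched in isolation. A minimal, self-contained Go example of a visited-set BFS over parent links (the `parents` map and `collectAncestors` helper are hypothetical stand-ins for `GetMarker` and `ParentMarkerIDs`, not the repo's actual code):

```go
package main

import "fmt"

// collectAncestors walks parent links breadth-first, using a visited set so
// that cyclic parent chains terminate instead of looping forever.
func collectAncestors(start string, parents map[string][]string) []string {
	visited := map[string]struct{}{start: {}}
	var ancestors []string
	queue := []string{start}
	for len(queue) > 0 {
		current := queue[0]
		queue = queue[1:]
		for _, p := range parents[current] {
			if _, seen := visited[p]; seen {
				continue // already handled: breaks cycles and dedupes parent IDs
			}
			visited[p] = struct{}{}
			ancestors = append(ancestors, p)
			queue = append(queue, p)
		}
	}
	return ancestors
}

func main() {
	// "a" and "b" reference each other: a cycle that would hang a naive walk.
	parents := map[string][]string{
		"c": {"b"},
		"b": {"a"},
		"a": {"b"},
	}
	fmt.Println(collectAncestors("c", parents)) // prints: [b a]
}
```

The visited set is checked before enqueueing, so each marker is processed at most once even when parent chains form a cycle.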
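For the `badger/marker_repo.go` batching suggestion, the chunking of marker IDs into bounded groups is plain Go regardless of whether `badgerhold`'s `In` matcher is available. A self-contained sketch (the `chunk` helper is invented, and the batch size of 2 is arbitrary for the demo):

```go
package main

import "fmt"

// chunk splits ids into slices of at most size elements, so each slice can
// back one bounded query instead of a full table scan.
func chunk(ids []string, size int) [][]string {
	var batches [][]string
	for len(ids) > 0 {
		n := size
		if len(ids) < n {
			n = len(ids)
		}
		batches = append(batches, ids[:n])
		ids = ids[n:]
	}
	return batches
}

func main() {
	ids := []string{"m1", "m2", "m3", "m4", "m5"}
	for _, batch := range chunk(ids, 2) {
		fmt.Println(batch) // prints [m1 m2], then [m3 m4], then [m5]
	}
}
```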
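For the `service.go` sweep-ordering fix, the principle is "apply the irreversible side effect only after its precondition succeeded". A minimal sketch with invented in-memory stores (`fakeVtxoStore`/`fakeMarkerStore` are illustrative, not the repo's real interfaces):

```go
package main

import (
	"errors"
	"fmt"
)

type fakeVtxoStore struct{ failSweep bool }

func (s *fakeVtxoStore) SweepVtxosByMarker(markerID string) error {
	if s.failSweep {
		return errors.New("sweep failed")
	}
	return nil
}

type fakeMarkerStore struct{ swept map[string]bool }

func (s *fakeMarkerStore) SweepMarker(markerID string) { s.swept[markerID] = true }

// sweepMarkerSafely sweeps the marker's VTXOs first and only marks the
// marker swept once that succeeded, so a failure leaves the marker unmarked.
func sweepMarkerSafely(vs *fakeVtxoStore, ms *fakeMarkerStore, markerID string) error {
	if err := vs.SweepVtxosByMarker(markerID); err != nil {
		return err // marker stays unmarked; caller can log and retry
	}
	ms.SweepMarker(markerID)
	return nil
}

func main() {
	ms := &fakeMarkerStore{swept: map[string]bool{}}
	_ = sweepMarkerSafely(&fakeVtxoStore{failSweep: true}, ms, "m1")
	fmt.Println(ms.swept["m1"]) // prints: false (failed sweep did not mark the marker)
}
```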
🧹 Nitpick comments (12)
internal/test/e2e/utils_test.go (1)
742-744: Acknowledge the TODO placeholder. The TODO is clear about the dependency on the SDK proto package exposing `Depth`. Consider tracking this with a GitHub issue so it doesn't get lost. Would you like me to open a GitHub issue to track re-enabling `setupRawIndexerClient` and `getVtxoDepthByOutpoint` once the SDK proto exposes `Depth`?
internal/core/domain/marker_test.go (2)
9-35: Good coverage of boundary cases. The table-driven test covers a solid range including edges (0, 99, 100, 101). Consider using `t.Run` with a subtest name for each case to get more granular test output on failure.
♻️ Optional: use subtests for better diagnostics
```diff
 for _, tt := range tests {
+	t.Run(fmt.Sprintf("depth_%d", tt.depth), func(t *testing.T) {
 		result := IsAtMarkerBoundary(tt.depth)
 		require.Equal(t, tt.expected, result, "IsAtMarkerBoundary(%d) should be %v", tt.depth, tt.expected)
+	})
 }
```
(You'd need to add `"fmt"` to imports.)
42-55: These tests only verify struct literal construction, not behavior. `TestMarkerStruct` and `TestSweptMarkerStruct` test that Go struct fields hold the values you assign — they don't test any domain logic. They're fine as documentation of the data model but provide no regression protection. Consider adding tests for actual marker operations (creation, parent resolution, etc.) as the marker logic matures.
internal/infrastructure/db/postgres/migration/20260211020000_add_markers.up.sql (1)
2-6: Consider adding `NOT NULL DEFAULT '[]'::jsonb` to `parent_markers`. The column currently allows NULL, which means application code must handle both NULL and empty array. Using a NOT NULL default simplifies queries and Go code that deserializes this field.
♻️ Suggested change
```diff
 CREATE TABLE IF NOT EXISTS marker (
 	id TEXT PRIMARY KEY,
 	depth INTEGER NOT NULL,
-	parent_markers JSONB -- JSON array of parent marker IDs
+	parent_markers JSONB NOT NULL DEFAULT '[]'::jsonb -- JSON array of parent marker IDs
 );
```
internal/core/domain/marker_repo.go (2)
5-44: Large interface — consider whether it could be split, and watch for unbounded queries. The interface has 16 methods mixing marker lifecycle, sweep operations, and VTXO queries. This is functional but may violate the Interface Segregation Principle as it grows.
More concretely, methods like `GetVtxosByMarker`, `GetVtxosByDepthRange`, and `GetVtxoChainByMarkers` (lines 30, 37, 41) return unbounded `[]Vtxo` slices. If marker/depth ranges can span many VTXOs, callers may hit memory pressure. Consider whether pagination or a limit parameter is warranted for these, especially `GetVtxosByDepthRange` which could span a very wide range.
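A bounded variant of these queries could take a page parameter. The `Page` type and `paginate` helper below are invented for illustration, not part of the repo's API:

```go
package main

import "fmt"

// Page describes a bounded window over a large result set.
type Page struct {
	Limit  int
	Offset int
}

// paginate returns the window of items selected by p, clamped to the slice.
func paginate(items []string, p Page) []string {
	if p.Offset >= len(items) {
		return nil
	}
	end := p.Offset + p.Limit
	if end > len(items) {
		end = len(items)
	}
	return items[p.Offset:end]
}

func main() {
	vtxos := []string{"v1", "v2", "v3", "v4", "v5"}
	// A caller would loop, advancing Offset by Limit until an empty page.
	fmt.Println(paginate(vtxos, Page{Limit: 2, Offset: 4})) // prints: [v5]
}
```

A cursor (e.g., last-seen outpoint) scales better than offsets for deep result sets, but either bounds memory per call.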
6-7: Clarify upsert semantics in the doc comment. The comment says "creates or updates a marker," but the method signature uses `error` as the only signal. It may be useful to document whether an update replaces `ParentMarkerIDs` entirely or merges, and whether updating a marker that has already been swept is allowed.
internal/infrastructure/db/sqlite/migration/20260211000000_add_markers.up.sql (1)
2-6: Consider adding an index on `parent_markers` for BFS descendant lookups. The `marker` table stores `parent_markers` as a JSON text column. The Badger implementation does BFS by querying markers whose `ParentMarkerIDs` contains a given ID. If a similar query pattern is used in SQLite (e.g., using `json_each` to find children), performance could degrade without an index strategy. This is fine for now if the query load is low, but worth keeping in mind.
internal/infrastructure/db/badger/marker_repo.go (2)
42-106: Constructor uses `interface{}` variadic config — consider a typed options struct. The `NewMarkerRepository(config ...interface{})` pattern with positional `interface{}` arguments is fragile and hard to use correctly. While this matches the existing codebase pattern (e.g., `NewVtxoRepository`), a typed config struct would be safer. This is fine for now if consistency with the existing pattern is preferred.
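For comparison, a typed options struct makes call sites self-describing and compile-time checked. The field names below are invented, not the repo's actual config:

```go
package main

import "fmt"

// MarkerRepoConfig replaces positional interface{} arguments with named,
// typed fields, so misuse fails at compile time instead of at runtime.
type MarkerRepoConfig struct {
	Dir      string // storage directory ("" could mean in-memory)
	LogLevel int
}

// NewMarkerRepository is a stand-in constructor: a real one would open the
// store; here we just echo the config to show the call-site ergonomics.
func NewMarkerRepository(cfg MarkerRepoConfig) (string, error) {
	return fmt.Sprintf("marker repo at %q (log level %d)", cfg.Dir, cfg.LogLevel), nil
}

func main() {
	desc, err := NewMarkerRepository(MarkerRepoConfig{Dir: "/tmp/markers", LogLevel: 1})
	if err != nil {
		panic(err)
	}
	fmt.Println(desc)
}
```

Adding a new option later is a non-breaking struct-field addition, whereas positional `interface{}` arguments silently shift meaning.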
116-136: Retry loop doesn't respect context cancellation. The retry loops (here and in similar patterns at lines 243, 390, 435) sleep unconditionally without checking `ctx.Done()`. If the context is cancelled, the function will still retry up to `maxRetries` times with 100ms sleeps. This is a minor concern given the small retry count.
internal/infrastructure/db/sqlite/marker_repo.go (1)
141-166: Make descendant sweeping atomic to avoid partial state. If an insert fails mid-loop, some markers are swept and others aren't. Wrapping the inserts in a single transaction avoids partial sweeps and reduces round-trips.
♻️ Suggested transaction wrapper
```diff
 func (m *markerRepository) SweepMarkerWithDescendants(
 	ctx context.Context, markerID string, sweptAt int64,
 ) (int64, error) {
 	// Get all descendant marker IDs (including the root marker) that are not already swept
 	descendantIDs, err := m.querier.GetDescendantMarkerIds(ctx, markerID)
 	if err != nil {
 		return 0, fmt.Errorf("failed to get descendant markers: %w", err)
 	}

-	// Insert each descendant into swept_marker
-	var count int64
-	for _, id := range descendantIDs {
-		err := m.querier.InsertSweptMarker(ctx, queries.InsertSweptMarkerParams{
-			MarkerID: id,
-			SweptAt:  sweptAt,
-		})
-		if err != nil {
-			return count, fmt.Errorf("failed to sweep marker %s: %w", id, err)
-		}
-		count++
-	}
-
-	return count, nil
+	tx, err := m.db.BeginTx(ctx, nil)
+	if err != nil {
+		return 0, err
+	}
+	q := queries.New(tx)
+	defer tx.Rollback()
+
+	var count int64
+	for _, id := range descendantIDs {
+		if err := q.InsertSweptMarker(ctx, queries.InsertSweptMarkerParams{
+			MarkerID: id,
+			SweptAt:  sweptAt,
+		}); err != nil {
+			return count, fmt.Errorf("failed to sweep marker %s: %w", id, err)
+		}
+		count++
+	}
+	if err := tx.Commit(); err != nil {
+		return 0, err
+	}
+	return count, nil
 }
```
internal/infrastructure/db/postgres/marker_repo.go (1)
144-169: Make descendant sweeping atomic to avoid partial state. Same concern as the sqlite implementation — if the loop fails mid-way, markers can be partially swept.
♻️ Suggested transaction wrapper
```diff
 func (m *markerRepository) SweepMarkerWithDescendants(
 	ctx context.Context, markerID string, sweptAt int64,
 ) (int64, error) {
 	// Get all descendant marker IDs (including the root marker) that are not already swept
 	descendantIDs, err := m.querier.GetDescendantMarkerIds(ctx, markerID)
 	if err != nil {
 		return 0, fmt.Errorf("failed to get descendant markers: %w", err)
 	}

-	// Insert each descendant into swept_marker
-	var count int64
-	for _, id := range descendantIDs {
-		err := m.querier.InsertSweptMarker(ctx, queries.InsertSweptMarkerParams{
-			MarkerID: id,
-			SweptAt:  sweptAt,
-		})
-		if err != nil {
-			return count, fmt.Errorf("failed to sweep marker %s: %w", id, err)
-		}
-		count++
-	}
-
-	return count, nil
+	tx, err := m.db.BeginTx(ctx, nil)
+	if err != nil {
+		return 0, err
+	}
+	q := queries.New(tx)
+	defer tx.Rollback()
+
+	var count int64
+	for _, id := range descendantIDs {
+		if err := q.InsertSweptMarker(ctx, queries.InsertSweptMarkerParams{
+			MarkerID: id,
+			SweptAt:  sweptAt,
+		}); err != nil {
+			return count, fmt.Errorf("failed to sweep marker %s: %w", id, err)
+		}
+		count++
+	}
+	if err := tx.Commit(); err != nil {
+		return 0, err
+	}
+	return count, nil
 }
```
internal/infrastructure/db/sqlite/sqlc/query.sql (1)
467-480: Replace LIKE-based JSON matching with json_each() for robustness. The recursive CTE uses `m.parent_markers LIKE '%"' || dm.id || '"%'` to check if a marker ID exists in a JSON array. While this works in practice, it's fragile: marker IDs containing SQL LIKE wildcards (`%`, `_`) would cause incorrect matches since the code doesn't escape them. The Postgres version correctly uses `@> jsonb_build_array(dm.id)` (line 473); SQLite should use the equivalent `json_each()` for consistency and correctness:
Suggested replacement
```diff
-  SELECT m.id FROM marker m
-  INNER JOIN descendant_markers dm ON (
-    m.parent_markers LIKE '%"' || dm.id || '"%'
-  )
+  SELECT m.id FROM marker m
+  INNER JOIN descendant_markers dm ON EXISTS (
+    SELECT 1 FROM json_each(m.parent_markers) je WHERE je.value = dm.id
+  )
```
```sql
CREATE VIEW vtxo_vw AS
SELECT v.*, string_agg(vc.commitment_txid, ',') AS commitments
FROM vtxo v
LEFT JOIN vtxo_commitment_txid vc
ON v.txid = vc.vtxo_txid AND v.vout = vc.vtxo_vout
GROUP BY v.txid, v.vout;
```
Inconsistent NULL handling between Postgres and SQLite views.
The SQLite vtxo_vw uses COALESCE(group_concat(...), '') (returns empty string when no commitments), but this Postgres view uses bare string_agg(...) (returns NULL when no commitments). This can cause behavioral differences across backends if downstream code doesn't uniformly handle both NULL and empty string.
Consider wrapping with COALESCE for consistency:
Suggested fix
```diff
 CREATE VIEW vtxo_vw AS
-SELECT v.*, string_agg(vc.commitment_txid, ',') AS commitments
+SELECT v.*, COALESCE(string_agg(vc.commitment_txid, ','), '') AS commitments
 FROM vtxo v
 LEFT JOIN vtxo_commitment_txid vc
 ON v.txid = vc.vtxo_txid AND v.vout = vc.vtxo_vout
 GROUP BY v.txid, v.vout;
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```sql
CREATE VIEW vtxo_vw AS
SELECT v.*, COALESCE(string_agg(vc.commitment_txid, ','), '') AS commitments
FROM vtxo v
LEFT JOIN vtxo_commitment_txid vc
ON v.txid = vc.vtxo_txid AND v.vout = vc.vtxo_vout
GROUP BY v.txid, v.vout;
```
🤖 Prompt for AI Agents
In
`@internal/infrastructure/db/postgres/migration/20260210100000_add_vtxo_depth.up.sql`
around lines 8 - 13, The Postgres view vtxo_vw currently returns NULL for the
commitments column when no rows exist because it uses
string_agg(vc.commitment_txid, ','); change the SELECT to wrap string_agg with
COALESCE (e.g., COALESCE(string_agg(...), '')) so commitments always yields an
empty string like the SQLite view; update the SELECT that references vtxo,
vtxo_commitment_txid and the commitments alias to use
COALESCE(string_agg(vc.commitment_txid, ','), '').
… to use BulkSweepMarkers
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
internal/infrastructure/db/badger/vtxo_repo.go (2)
101-119: ⚠️ Potential issue | 🟠 Major
`GetVtxos` returns `nil, nil` when any single outpoint is missing, discarding all previously collected vtxos. `getVtxo` returns `(nil, nil)` for `ErrNotFound` (lines 476-477), so the `strings.Contains(err.Error(), "not found")` check on line 108 is dead code for that path. Instead, execution falls through to line 113 where `vtxo == nil` triggers `return nil, nil`, silently dropping all vtxos already appended. This should `continue` instead, matching the likely intent of skipping missing outpoints.
Proposed fix
```diff
 for _, outpoint := range outpoints {
 	vtxo, err := r.getVtxo(ctx, outpoint)
 	if err != nil {
-		if strings.Contains(err.Error(), "not found") {
-			continue
-		}
 		return nil, err
 	}
 	if vtxo == nil {
-		return nil, nil
+		continue
 	}
 	vtxos = append(vtxos, *vtxo)
 }
```
277-322: ⚠️ Potential issue | 🟡 Minor
Inconsistent filter: query uses `Ge` (>=) but post-filter uses `>` (strictly greater). Lines 286-288 and 295-297 fetch vtxos with `Amount >= amountFilter`, but lines 306 and 311 then exclude vtxos where `Amount == amountFilter` by checking `vtxo.Amount > amountFilter`. This means vtxos with amount exactly equal to the filter are fetched from the DB but silently dropped. Either the query should use `Gt` or the post-filter should use `>=`.
🤖 Fix all issues with AI agents
In `@internal/core/application/service_test.go`:
- Around line 149-154: The test case "no spent vtxos" expects depth 1 but the
service's logic leaves newDepth at 0 when spentOutpoints is empty; update the
test in service_test.go for the case named "no spent vtxos (empty)" to set
expectedDepth to 0 so it matches the actual behavior of the service (referencing
newDepth, maxDepth and spentOutpoints in the service implementation).
In `@internal/infrastructure/db/badger/marker_repo.go`:
- Around line 258-276: SweepMarker currently does a full table scan by calling
r.vtxoStore.Find(&allDtos, &badgerhold.Query{}) for every marker (and
BulkSweepMarkers calls SweepMarker in a loop), causing N full scans; change
SweepMarker to query only VTXOs that contain the marker by using
r.vtxoStore.Find(&filteredDtos,
badgerhold.Where("MarkerIDs").Contains(markerID)) (same pattern as
getDescendantMarkerIds), iterate filteredDtos (type vtxoDTO) and call
r.vtxoStore.Update(outpoint.String(), dto) to set Swept=true and UpdatedAt; this
ensures each marker triggers a targeted query instead of scanning all VTXOs and
avoids the N×full-scan behavior in BulkSweepMarkers.
- Around line 438-462: GetVtxosByMarker currently loads all VTXOs then filters
in memory; change the find to use an indexed query so Badger filters by
MarkerIDs: replace the badgerhold.Query{} call in
markerRepository.GetVtxosByMarker with
badgerhold.Where("MarkerIDs").Contains(markerID) (keeping the same
r.vtxoStore.Find(&dtos, query) pattern), then retain the existing loop to
compute vtxo.Swept via r.isAnyMarkerSwept(dto.MarkerIDs) and append matching
DTOs to the result slice.
In `@internal/infrastructure/db/postgres/marker_repo.go`:
- Around line 159-184: SweepMarkerWithDescendants does inserts in a loop without
a transaction, causing partial commits on failure; wrap the entire operation in
a DB transaction so either all descendant InsertSweptMarker calls succeed or
none do. Start a transaction (e.g., via m.db.BeginTx or your repo's transaction
helper), run GetDescendantMarkerIds and then perform each
queries.InsertSweptMarker using the transactional querier/context (or passing tx
into the querier methods), rollback on any error and commit at the end, and
return the count only after a successful commit; reference functions:
SweepMarkerWithDescendants, GetDescendantMarkerIds, and InsertSweptMarker.
In
`@internal/infrastructure/db/postgres/migration/20260210100000_add_depth_and_markers.down.sql`:
- Around line 1-14: Move the view drops to before any column/table drops: drop
views intent_with_inputs_vw and vtxo_vw first, then drop index idx_vtxo_markers,
drop columns markers and depth from table vtxo, and finally drop marker and
swept_marker tables; update the script so vtxo_vw and intent_with_inputs_vw are
removed prior to altering vtxo to avoid PostgreSQL dependency errors.
In
`@internal/infrastructure/db/postgres/migration/20260210100000_add_depth_and_markers.up.sql`:
- Around line 1-5: Change the new column definition for markers on table vtxo to
be non-nullable with a default empty JSON array by altering the ADD COLUMN
statement for markers to "ADD COLUMN IF NOT EXISTS markers JSONB NOT NULL
DEFAULT '[]'::jsonb" (keep the existing GIN index creation), and ensure any
separate backfill step that populates markers for existing rows is consistent
with this default (remove/adjust redundant backfill or ensure it uses
'[]'::jsonb for rows without markers).
In `@internal/infrastructure/db/postgres/sqlc/queries/query.sql.go`:
- Around line 1770-1776: The SQL in the constant selectVtxosByArkTxid used by
the method SelectVtxosByArkTxid filters on the wrong column (txid); update the
query string to use WHERE ark_txid = $1 (or WHERE ark_txid = `@ark_txid` in the
.sql source) so the function returns VTXOs created by the given ark transaction;
update the selectVtxosByArkTxid SQL in both the Postgres and SQLite query.sql
sources so the generated query and the Queries.SelectVtxosByArkTxid
implementation both filter on ark_txid instead of txid.
- Around line 118-131: The recursive CTE used by GetDescendantMarkerIds scans
marker.parent_markers with the jsonb containment operator (@>), causing repeated
sequential scans; add a migration that creates a GIN index on the parent_markers
column (marker.parent_markers) so the recurrence m.parent_markers @>
jsonb_build_array(dm.id) can use the index; implement the migration file that
executes CREATE INDEX IF NOT EXISTS idx_marker_parent_markers ON marker USING
GIN (parent_markers) and ensure it is applied in your migrations pipeline.
In `@internal/infrastructure/db/service.go`:
- Around line 210-219: The code appends badgerVtxoRepo.GetStore() onto
config.DataStoreConfig which can mutate the original slice's backing array;
instead create a new slice copy of config.DataStoreConfig before appending to
avoid side effects. Locate the block that builds markerConfig (using
config.DataStoreConfig, badgerVtxoRepo.GetStore() and markerStoreFactory) and
replace the direct append with creating a new slice sized to hold the elements,
copying config.DataStoreConfig into it, then append badgerVtxoRepo.GetStore() to
that new slice and pass the new slice to markerStoreFactory.
- Around line 492-496: CreateRootMarkersForVtxos failures are currently only
warned and can leave persisted VTXOs referencing missing markers; update the
block where s.markerStore.CreateRootMarkersForVtxos(ctx, newVtxos) is called to
either (a) retry the CreateRootMarkersForVtxos call with the same retry/backoff
strategy used for persisting VTXOs (mirror the loop around the VTXO
persistence), or (b) if retrying fails, return the error to fail-fast so the
caller can roll back or handle incomplete state; locate the call to
s.markerStore.CreateRootMarkersForVtxos and implement a retry loop (or propagate
the error) and ensure logs include context about the affected VTXO set when
giving up.
- Around line 538-577: The GetVtxos DB failure path leaves newDepth at 0 and
parentMarkerIDs nil which makes IsAtMarkerBoundary(0) treat chained VTXOs as
root markers; update the error path in the block that calls s.vtxoStore.GetVtxos
(inside the loop over offchainTx.CheckpointTxs and subsequent processing) to
avoid creating misleading root markers by either returning the error upward
(propagate the GetVtxos error) or setting newDepth to a sentinel (e.g., a
special unknown value) and ensuring downstream logic
(IsAtMarkerBoundary/newDepth handling) treats that sentinel as “unknown” (no
root marker creation) instead of depth 0, and document the chosen approach in
the same function where newDepth and parentMarkerIDs are computed.
In
`@internal/infrastructure/db/sqlite/migration/20260210000000_add_depth_and_markers.down.sql`:
- Around line 28-33: The down migration tries to copy the removed swept column
from vtxo into vtxo_temp causing "no such column: swept"; fix by reconstructing
swept or restoring the column before the INSERT: either add a temporary swept
column to vtxo (or vtxo_temp) prior to the INSERT (so INSERT INTO vtxo_temp
SELECT ... swept ... FROM vtxo succeeds), or change the INSERT SELECT to compute
swept from swept_marker (join vtxo with swept_marker and derive swept) so the
SELECT no longer references the missing swept column; look for symbols
vtxo_temp, vtxo, swept, swept_marker and vtxo_new when applying the change.
In `@internal/infrastructure/db/sqlite/sqlc/query.sql`:
- Around line 471-484: GetDescendantMarkerIds currently matches parent_markers
via m.parent_markers LIKE '%"' || dm.id || '"%' which is brittle (false
positives for '%'/'_' and overlapping prefixes) and forces full scans; replace
the LIKE with a JSON-aware check using SQLite's json_each (e.g., JOIN/EXISTS
over json_each(m.parent_markers) j WHERE j.value = dm.id) or, better, migrate
parent_markers to a normalized join table (parent_marker mapping) and update
descendant_markers to join that table; also add an integration test for
GetDescendantMarkerIds using marker IDs containing characters like '%'/'_' and
overlapping prefixes to ensure correctness, and document the current limitation
of the LIKE approach in the schema/query comments.
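One fix in the list above (copying `config.DataStoreConfig` before appending in `service.go`) rests on a general Go slice property worth seeing in isolation: `append` into spare capacity writes to the shared backing array, so writes through the new slice are visible through the original. A standalone demonstration with invented values:

```go
package main

import "fmt"

func main() {
	base := make([]string, 2, 4) // spare capacity: appends reuse the backing array
	base[0], base[1] = "dbDir", "dbName"

	alias := append(base, "extraStore") // shares base's backing array
	alias[0] = "mutated"

	fmt.Println(base[0]) // prints: mutated (the original slice was changed)

	// Safe pattern suggested by the review: copy into a fresh slice first.
	safe := make([]string, len(base), len(base)+1)
	copy(safe, base)
	safe = append(safe, "extraStore")
	safe[0] = "independent"
	fmt.Println(base[0]) // still "mutated": writes through safe don't reach base
}
```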
🧹 Nitpick comments (20)
internal/core/application/sweeper_test.go (2)
734-735: Non-obvious Txid values for `i >= 26`. `string(rune('A'+i))` for `i` in 0..49 produces ASCII letters A–Z for `i < 26`, but non-letter characters (`[`, `\`, `]`, …) for `i >= 26`. This doesn't break the test (uniqueness is preserved), but `fmt.Sprintf("child-%d", i)` would be clearer and consistent with `TestCreateCheckpointSweepTask_LargeMarkerSet` (line 1189).
Suggested fix
```diff
-		childOutpoints[i] = domain.Outpoint{Txid: "child" + string(rune('A'+i)), VOut: 0}
+		childOutpoints[i] = domain.Outpoint{Txid: fmt.Sprintf("child-%d", i), VOut: 0}
```
22-158: Consider generating mocks to reduce boilerplate. ~400 lines of hand-rolled mocks for `WalletService`, `VtxoRepository`, `MarkerRepository`, and `TxBuilder`. Most methods are stubs returning zero values. Using a tool like `mockery` or `counterfeiter` would auto-generate these, reduce maintenance burden as interfaces evolve, and keep the test file focused on test logic.
internal/core/domain/marker_repo.go (1)
41-47: VTXO retrieval methods on MarkerRepository blur the boundary with VtxoRepository. `GetVtxosByDepthRange`, `GetVtxosByArkTxid`, and `GetVtxoChainByMarkers` return `[]Vtxo` and are essentially VTXO queries. Placing them here is understandable since they're marker/depth-optimized, but it means callers now need to know which repository to ask for VTXOs depending on the query pattern. If the interface continues to grow, consider whether a dedicated chain-traversal service or moving these to `VtxoRepository` with marker-aware implementations would keep the boundaries cleaner.
internal/infrastructure/db/sqlite/sqlc/query.sql (1)
261-270: Liquidity queries now scan every vtxo row with a correlated LIKE subquery. `SelectExpiringLiquidityAmount` and `SelectRecoverableLiquidityAmount` both use `EXISTS (SELECT 1 FROM swept_marker sm WHERE v.markers LIKE '%"' || sm.marker_id || '"%')`. This is essentially a cross join between `vtxo` and `swept_marker` with a LIKE predicate on every pair — O(vtxos × swept_markers) string scans per query. As the number of swept markers grows, these queries will degrade.
Consider caching swept status on the vtxo row itself (a denormalized `swept` flag updated during BulkSweepMarkers), or evaluating sweep status in the application layer where the marker set is already available.
Also applies to: 273-279
internal/infrastructure/db/postgres/sqlc/query.sql (1)
500-514: Inconsistent projection: `SELECT *` vs `sqlc.embed(vtxo_vw)` across vtxo queries. `SelectVtxosByDepthRange`, `SelectVtxosByArkTxid`, and `SelectVtxoChainByMarker` use `SELECT * FROM vtxo_vw`, while all other vtxo queries (e.g., `SelectAllVtxos`, `SelectVtxo`, `SelectSweepableUnrolledVtxos`) use `SELECT sqlc.embed(vtxo_vw) FROM vtxo_vw`. This generates different Go return types — flat structs vs. nested `struct { VtxoVw VtxoVw }` — requiring different mapping code in the repository layer.
Consider using `sqlc.embed(vtxo_vw)` consistently so the generated Go types are uniform.
Suggested fix
```diff
 -- name: SelectVtxosByDepthRange :many
 -- Get all VTXOs within a depth range, useful for filling gaps between markers
-SELECT * FROM vtxo_vw
+SELECT sqlc.embed(vtxo_vw) FROM vtxo_vw
 WHERE depth >= @min_depth AND depth <= @max_depth
 ORDER BY depth DESC;

 -- name: SelectVtxosByArkTxid :many
 -- Get all VTXOs created by a specific ark tx (offchain tx)
-SELECT * FROM vtxo_vw WHERE txid = @ark_txid;
+SELECT sqlc.embed(vtxo_vw) FROM vtxo_vw WHERE txid = @ark_txid;

 -- name: SelectVtxoChainByMarker :many
 -- Get VTXOs whose markers JSONB array contains any of the given marker IDs
-SELECT * FROM vtxo_vw
+SELECT sqlc.embed(vtxo_vw) FROM vtxo_vw
 WHERE markers ?| @marker_ids::TEXT[]
 ORDER BY depth DESC;
```
internal/infrastructure/db/sqlite/vtxo_repo.go (1)
538-548: Silent error swallowing in `parseMarkersJSONFromVtxo` could mask data corruption. If the JSON in the `markers` column is malformed, this function silently returns `nil` without any logging. While defensive, this could make it hard to diagnose data integrity issues.
Consider adding a log warning on parse failure, consistent with how other parse errors are handled elsewhere in the codebase.
Optional: add warning log on parse failure
```diff
 func parseMarkersJSONFromVtxo(markersJSON string) []string {
 	if markersJSON == "" {
 		return nil
 	}
 	var markerIDs []string
 	if err := json.Unmarshal([]byte(markersJSON), &markerIDs); err != nil {
+		// Log warning to help diagnose data corruption
+		log.WithError(err).Warn("failed to parse markers JSON from vtxo")
 		return nil
 	}
 	return markerIDs
 }
```
internal/core/application/indexer_test.go (1)
15-299: Consider using a mock generation tool to reduce boilerplate. The manual mock implementations (~280 lines of stubs) are correct but add significant maintenance burden. Tools like `mockery` or `moq` could auto-generate these from the interfaces and keep them in sync as the repository interfaces evolve.
That said, the explicit nil-interface handling in `Markers()` (lines 288–294) is a valuable pattern worth keeping regardless.
internal/infrastructure/db/badger/marker_repo.go (1)
524-528: Dead error handling — assign-then-discard pattern is misleading. Line 525 calls `r.SweepMarker(...)` and assigns to `err`, then line 528 discards it with `_ = err`. This is confusing — use `_ =` directly on the call.
Simplify the error discard
```diff
-	if err := r.SweepMarker(ctx, markerID, time.Now().Unix()); err != nil {
-		// Non-fatal - the vtxos are already marked as swept
-		_ = err
-	}
+	// Non-fatal - the vtxos are already marked as swept
+	_ = r.SweepMarker(ctx, markerID, time.Now().Unix())
```
internal/infrastructure/db/sqlite/marker_repo.go (2)
398-508: Four nearly identical `rowToVtxoFrom*` functions — consider a shared mapper. `rowToVtxoFromMarkerQuery`, `rowToVtxoFromDepthRangeQuery`, `rowToVtxoFromArkTxidQuery`, and `rowToVtxoFromChainQuery` all perform the same mapping from `VtxoVw` embedded in different sqlc row types. Since the inner `row.VtxoVw` is the same type (`queries.VtxoVw`), you could extract a shared `vtxoVwToDomain(vw queries.VtxoVw) domain.Vtxo` and call it from each wrapper, reducing ~100 lines of duplication.
Note that `vtxo_repo.go` already has `rowToVtxo(row queries.VtxoVw)` which does essentially the same mapping — you could reuse that directly.
Consolidate using the existing rowToVtxo from vtxo_repo.go
```diff
 func rowToVtxoFromMarkerQuery(row queries.SelectVtxosByMarkerIdRow) domain.Vtxo {
-	var commitmentTxids []string
-	if commitments, ok := row.VtxoVw.Commitments.(string); ok && commitments != "" {
-		commitmentTxids = strings.Split(commitments, ",")
-	}
-	return domain.Vtxo{
-		Outpoint: domain.Outpoint{
-			Txid: row.VtxoVw.Txid,
-			VOut: uint32(row.VtxoVw.Vout),
-		},
-		// ... all fields ...
-	}
+	return rowToVtxo(row.VtxoVw)
 }
```
Apply the same pattern to all four functions.
510-519: Duplicate `parseMarkersJSON` — already exists as `parseMarkersJSONFromVtxo` in `vtxo_repo.go`. Both functions in this package have identical logic. Consolidate into a single shared function.
internal/core/application/indexer.go (2)
internal/core/application/indexer.go (2)
416-431: Consider batchingGetMarkercalls to reduce DB round-trips during BFS.Each iteration of the BFS loop issues an individual
GetMarkerDB call (line 420). For deep marker chains (e.g., depth 20000 with markers every 100 levels = ~200 markers), this results in ~200 sequential queries. A batch approach usingGetMarkersByIdson the current queue batch would be significantly faster.♻️ Sketch of batched BFS
for len(queue) > 0 { - currentID := queue[0] - queue = queue[1:] - - marker, err := i.repoManager.Markers().GetMarker(ctx, currentID) - if err != nil || marker == nil { + // Fetch all markers in current queue batch at once + batch := queue + queue = nil + markers, err := i.repoManager.Markers().GetMarkersByIds(ctx, batch) + if err != nil { continue } - - for _, parentID := range marker.ParentMarkerIDs { - if !visited[parentID] { - visited[parentID] = true - markerIDs = append(markerIDs, parentID) - queue = append(queue, parentID) + for _, marker := range markers { + for _, parentID := range marker.ParentMarkerIDs { + if !visited[parentID] { + visited[parentID] = true + markerIDs = append(markerIDs, parentID) + queue = append(queue, parentID) + } } } }
465-474: Cache is mutated via the `cache` map parameter — document this side effect. `getVtxosFromCacheOrDB` updates the caller's map in-place (line 473). This is correct for the current usage pattern, but the mutation is non-obvious. A brief doc note on the side effect would improve maintainability.
internal/infrastructure/db/postgres/migration/20260210100000_add_depth_and_markers.up.sql (2)
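The side effect flagged for `getVtxosFromCacheOrDB` above follows from Go maps being reference-like: a callee's writes to a map parameter are visible to the caller. A tiny illustration with invented names:

```go
package main

import "fmt"

// fetchWithCache returns the cached value or computes it, storing the result
// back into cache. The write is visible to the caller: maps are references.
func fetchWithCache(cache map[string]int, key string) int {
	if v, ok := cache[key]; ok {
		return v
	}
	v := len(key) // stand-in for the real DB lookup
	cache[key] = v
	return v
}

func main() {
	cache := map[string]int{}
	fetchWithCache(cache, "txid123")
	fmt.Println(cache["txid123"]) // prints: 7 (the callee's write landed in our map)
}
```

A doc comment on the real function ("populates cache as a side effect") is usually enough to make this pattern safe to maintain.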
73-83: Correlated EXISTS subquery in the view may degrade as `swept_marker` grows. The `vtxo_vw` view computes `swept` via `EXISTS (SELECT 1 FROM swept_marker sm WHERE v.markers @> jsonb_build_array(sm.marker_id))`. This scans `swept_marker` for each VTXO row. While the GIN index on `markers` helps the containment check, this is effectively a semi-join where the outer side (`swept_marker`) is iterated per-vtxo. As the number of swept markers grows, this scan may become expensive for queries that touch many VTXOs.
Consider whether a reverse lookup (joining vtxo markers against the `swept_marker` PK) or a materialized approach would scale better for your expected data volumes.
25-30: Remove intermediate view creation — they are dropped and recreated without ever being used.

The views created at lines 25-30 and 32-40 are dropped at lines 65-66 before being recreated at lines 73-93. The backfill queries (lines 44-62) query `vtxo` directly, so these intermediate views are never referenced and can be removed to simplify the migration.

♻️ Simplified migration flow
```diff
 -- Drop views before dropping the swept column (views depend on it via v.*)
-DROP VIEW IF EXISTS intent_with_inputs_vw;
-DROP VIEW IF EXISTS vtxo_vw;
-
-CREATE VIEW vtxo_vw AS
-SELECT v.*, string_agg(vc.commitment_txid, ',') AS commitments
-FROM vtxo v
-LEFT JOIN vtxo_commitment_txid vc
-ON v.txid = vc.vtxo_txid AND v.vout = vc.vtxo_vout
-GROUP BY v.txid, v.vout;
-
-CREATE VIEW intent_with_inputs_vw AS
-SELECT vtxo_vw.*,
-       intent.id,
-       intent.round_id,
-       intent.proof,
-       intent.message
-FROM intent
-LEFT OUTER JOIN vtxo_vw
-ON intent.id = vtxo_vw.intent_id;
-
 -- Backfill: Create a marker for every existing VTXO using its outpoint as marker ID
```

internal/infrastructure/db/sqlite/migration/20260210000000_add_depth_and_markers.up.sql (1)
22-41: Intermediate view recreation appears unused — backfill queries reference `vtxo` directly.

Same as the Postgres migration: the views created at lines 26-41 are dropped again at lines 96-97 without being referenced by the backfill statements (lines 45-62). They add migration complexity without benefit.
internal/infrastructure/db/postgres/marker_repo.go (2)
249-274: TOCTOU between count query and sweep insert in `SweepVtxosByMarker`.

`CountUnsweptVtxosByMarkerId` (line 260) and `InsertSweptMarker` (line 266) are not atomic. The returned count may not reflect the actual number of VTXOs affected by the sweep. Since the count is only used for logging/metrics, this isn't a correctness issue, but worth noting.
426-436: Silent error swallowing in `parseMarkersJSONB` — consider logging.

Unmarshal errors at line 432 are silently swallowed. If corrupted marker JSON ends up in the database, this would silently produce nil marker IDs, making affected VTXOs invisible to marker-based queries. A debug-level log would aid troubleshooting without adding noise.
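A sketch of what the suggested debug logging could look like. The function name matches the review comment, but the signature and shape here are illustrative assumptions, not the PR's actual implementation:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// parseMarkersJSONB decodes a JSONB markers column into marker IDs.
// Instead of silently returning nil on corrupt JSON, it logs at debug
// level so corrupted rows are visible in operator logs.
// Hypothetical shape, for illustration only.
func parseMarkersJSONB(raw []byte) []string {
	if len(raw) == 0 {
		return nil
	}
	var ids []string
	if err := json.Unmarshal(raw, &ids); err != nil {
		// Debug-level log: aids troubleshooting without adding noise.
		log.Printf("debug: corrupt markers JSON %q: %v", raw, err)
		return nil
	}
	return ids
}

func main() {
	fmt.Println(parseMarkersJSONB([]byte(`["m_0","m_100"]`))) // [m_0 m_100]
	fmt.Println(parseMarkersJSONB([]byte(`{not json`)))       // [] plus a debug log line
}
```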
internal/core/application/service_test.go (1)
562-567: `outputs[0].MarkerIDs` is re-sorted on every loop iteration.

The sort at line 563 mutates `outputs[0].MarkerIDs` in-place on each iteration. Move it before the loop.

♻️ Minor optimization
```diff
 // All outputs must have the same marker IDs
+sort.Strings(outputs[0].MarkerIDs)
 for i := 1; i < len(outputs); i++ {
-	sort.Strings(outputs[0].MarkerIDs)
 	sort.Strings(outputs[i].MarkerIDs)
 	require.Equal(t, outputs[0].MarkerIDs, outputs[i].MarkerIDs,
 		"output %d has different markers than output 0", i)
 }
```

internal/infrastructure/db/badger/vtxo_repo.go (2)
23-26: Duplicate accessors: `GetStore()` and `Store()` return the same value.

Both methods on Lines 23-26 and Lines 421-424 return `r.store` with identical signatures. Pick one and remove the other to avoid confusion about which to call.

Also applies to: 421-424
625-637: Redundant visited check in `GetSweepableVtxosByCommitmentTxid`.

Line 627 checks `!visited[outpointKey]`, and Line 628 checks `!seen` on the same key. Since `visited` maps to `bool`, `!visited[key]` is true iff the key is absent (zero-value `false`), making the inner check always true when reached. This also means Lines 633-635 (enqueue `ArkTxid`) are unreachable for already-visited outpoints — which is correct — but the double-check is confusing. Compare with the cleaner pattern in `GetAllChildrenVtxos` (Lines 668-676).

Simplify to match the GetAllChildrenVtxos pattern
```diff
 for _, vtxo := range vtxos {
 	outpointKey := vtxo.Outpoint.String()
-	if !visited[outpointKey] {
-		if _, seen := visited[outpointKey]; !seen {
-			visited[outpointKey] = true
-			outpoints = append(outpoints, vtxo.Outpoint)
-		}
-
-		if vtxo.ArkTxid != "" {
-			queue = append(queue, vtxo.ArkTxid)
-		}
+	if !visited[outpointKey] {
+		visited[outpointKey] = true
+		outpoints = append(outpoints, vtxo.Outpoint)
+		if vtxo.ArkTxid != "" {
+			queue = append(queue, vtxo.ArkTxid)
+		}
 	}
 }
```
internal/infrastructure/db/postgres/migration/20260210100000_add_depth_and_markers.down.sql (outdated comment, resolved)
internal/infrastructure/db/sqlite/migration/20260210000000_add_depth_and_markers.down.sql (outdated comment, resolved)
# Conflicts:
#   internal/infrastructure/db/sqlite/sqlc/queries/query.sql.go
```go
vtxoCache := make(map[string]domain.Vtxo)
loadedMarkers := make(map[string]bool)

for len(nextVtxos) > 0 {
```
Besides pagination improvements, we must get rid of this loop. It is not scalable and has already caused memory issues. I noticed you added the GetVtxoChainByMarkers method, shouldn't we leverage it?
I think you're right, but this may be better suited to a follow-up PR given the scope of the change.
- The loop doesn't just fetch VTXOs — it also fetches offchain transactions, decodes checkpoint PSBTs, builds ChainTx with spends/types, and handles the batch tree branch case
- Pagination would need to work completely differently with marker-based fetching
- The response ordering (chain from leaf to root) would need to be reconstructed from a flat set of marker-fetched VTXOs
This PR adds the marker infrastructure, which is a prerequisite for that optimization; I'd say we should do it as a follow-up PR.
+1 for another PR, easier to review too. But we need to do it before merging this one.
We have a performance issue using that loop: for a 5k-tx chain, it takes 15s on master to respond. With this branch it's 12s thanks to the cursor improvements, which is not significant.
We need to know whether the markers really improve response time to the client (I was expecting less than 3s for a 5k-tx chain), just to confirm it's the "right" change for this vtxo chain getter.
The loop wasn't removed, but I believe the PR captures the spirit of the ask, which is removing the bottleneck (all the DB round-trip calls). It changes the DB access pattern from O(chain_length) individual GetVtxos calls to O(chain_length / MarkerInterval) bulk fetches. The VTXOs are prefetched into vtxoCache, and the loop reads from the cache instead of hitting the DB on each iteration.
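The cache-first access pattern described above can be sketched roughly as follows. All names here (ensureCached, fakeDB, the bulk-fetch counter) are illustrative stand-ins for the PR's ensureVtxosCached/GetVtxosByMarker, not its actual code:

```go
package main

import "fmt"

// Vtxo is a minimal stand-in for the domain VTXO type.
type Vtxo struct {
	Outpoint  string
	MarkerIDs []string
}

// fakeDB simulates bulk marker-window fetches; in the real code this
// would be a repository call such as GetVtxosByMarker.
var fakeDB = map[string][]Vtxo{
	"m0": {
		{Outpoint: "a:0", MarkerIDs: []string{"m0"}},
		{Outpoint: "b:0", MarkerIDs: []string{"m0"}},
	},
}

var bulkFetches int

// ensureCached returns the VTXO for outpoint, prefetching its whole
// marker window into cache on a miss so later BFS iterations hit the
// cache instead of the DB.
func ensureCached(cache map[string]Vtxo, loaded map[string]bool, outpoint, markerID string) (Vtxo, bool) {
	if v, ok := cache[outpoint]; ok {
		return v, true
	}
	if !loaded[markerID] {
		loaded[markerID] = true
		bulkFetches++ // one bulk fetch per marker window, not per VTXO
		for _, v := range fakeDB[markerID] {
			cache[v.Outpoint] = v
		}
	}
	v, ok := cache[outpoint]
	return v, ok
}

func main() {
	cache := map[string]Vtxo{}
	loaded := map[string]bool{}
	ensureCached(cache, loaded, "a:0", "m0") // miss: triggers one bulk fetch
	ensureCached(cache, loaded, "b:0", "m0") // hit: served from cache
	fmt.Println(bulkFetches) // 1
}
```

Two VTXOs in the same marker window cost a single DB round-trip, which is the O(chain_length / MarkerInterval) behavior the comment describes.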
```go
// MarkerInterval is the depth interval at which markers are created.
// VTXOs at depth 0, 100, 200, etc. create new markers.
const MarkerInterval = 100
```
What happens to prod if we change this value? Say we have a DB built with MarkerInterval 100 and we migrate to 150; do we need a specific migration for this?
That would be problematic: newMarker would create and skip at the wrong intervals relative to the existing data, and parent marker linking would be inconsistent. I don't think we could write a migration to fix this, unless the migration simply rebuilds the DAG whenever the constant changes (painful). We need to treat this value as a "please do not change" constant; I could document that accordingly.
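To see why the constant must not change, here is a small sketch (illustrative, not the PR's code) of how a depth maps to its governing marker window; changing the interval silently reassigns existing VTXOs to different windows:

```go
package main

import "fmt"

// markerBoundary returns the marker-creation depth that governs a VTXO
// at the given depth, assuming markers are created every `interval`
// depths (MarkerInterval = 100 in this PR). Hypothetical helper.
func markerBoundary(depth, interval int) int {
	return depth - depth%interval
}

func main() {
	// A VTXO at depth 250 belongs to the window starting at 200 under
	// interval 100, but to the window starting at 150 under interval 150:
	// existing marker links would no longer match the data.
	fmt.Println(markerBoundary(250, 100)) // 200
	fmt.Println(markerBoundary(250, 150)) // 150
}
```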
🔍 Arkana PR Review

Major feature: Scale the DAG — VTXO chain pagination, depth tracking, indexer caching

Summary

Large PR (~13.6k lines) that adds:
Security Analysis
✅ Bounds on pagination:
✅ Visited set prevents infinite loops:
✅ Cache is request-scoped:
Observations
Cross-repo impact

SDKs consuming the indexer API will see the new

Verdict

Well-engineered scalability improvement. The cursor-based pagination and caching are important for large VTXO chains. The page token input surface is minimal risk. Recommend careful review of the BFS termination logic to ensure no edge case drops VTXOs during pagination boundaries.
🔍 Arkana Review — arkd#908 (Scale the DAG)
Summary: Introduces a marker-based DAG checkpoint system for VTXO chains. Markers are created every 100 depths, enabling bulk sweep operations (1 INSERT per 100 VTXOs vs 1 UPDATE per VTXO) and prefetch-based chain traversal via ensureVtxosCached. Adds token-based pagination for GetVtxoChain, a depth field on VTXOs, and corresponding migrations for both Postgres and SQLite.
This is a large PR (10.5K lines, ~4.2K non-test). I've focused on the data model, migration safety, security implications, and sweep correctness.
✅ Architecture — well designed
- Marker model is clean: `MarkerInterval = 100`, markers created at depth boundaries (0, 100, 200...), parent marker IDs for DAG traversal. The `NewMarker()` function is deterministic and side-effect free.
- Swept marker as append-only table — excellent choice. Instead of mutating a `swept` boolean on individual VTXOs, inserting into `swept_marker` marks all VTXOs sharing that marker as swept atomically. This is both faster (100x fewer writes) and more correct (no stale swept state).
- Dynamic swept computation via EXISTS subquery in `vtxo_vw`:

  ```sql
  EXISTS (
      SELECT 1 FROM swept_marker sm
      WHERE v.markers @> jsonb_build_array(sm.marker_id)
  ) AS swept
  ```

  This is always correct by construction — no stale state possible. The GIN index on `markers` makes this efficient.
- Token-based pagination for `GetVtxoChain` — `encodeChainCursor`/`decodeChainCursor` using base64-encoded JSON is correct. The backward-compat handling (nil page + empty token = full chain) preserves existing client behavior.
- Marker window prefetch in `ensureVtxosCached` — when a cache miss occurs, loading all VTXOs sharing the same marker pre-warms the cache for subsequent BFS iterations. The `loadedMarkers` dedup map prevents redundant marker loads.
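A minimal sketch of what base64-encoded JSON cursor encoding can look like. The cursor fields and function shapes here are assumptions for illustration, not the PR's exact encodeChainCursor/decodeChainCursor:

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// chainCursor is a hypothetical stand-in for the pagination cursor;
// the real cursor's fields may differ.
type chainCursor struct {
	NextTxids []string `json:"nextTxids"`
	Depth     int32    `json:"depth"`
}

// encodeChainCursor serializes the cursor to JSON and wraps it in
// base64 so it can travel as an opaque page token.
func encodeChainCursor(c chainCursor) (string, error) {
	raw, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(raw), nil
}

// decodeChainCursor reverses encodeChainCursor; an empty or malformed
// token yields an error the caller can map to "start from the top".
func decodeChainCursor(token string) (chainCursor, error) {
	var c chainCursor
	raw, err := base64.StdEncoding.DecodeString(token)
	if err != nil {
		return c, err
	}
	err = json.Unmarshal(raw, &c)
	return c, err
}

func main() {
	token, _ := encodeChainCursor(chainCursor{NextTxids: []string{"abc"}, Depth: 300})
	got, _ := decodeChainCursor(token)
	fmt.Println(got.Depth, got.NextTxids[0]) // 300 abc
}
```

The round-trip property (decode(encode(c)) == c) is what makes the token safe to hand back to clients unchanged.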
🔒 Security
- Sweep correctness — the bulk sweep path (`BulkSweepMarkers`) sweeps by marker ID. Since markers are tied to depth boundaries and VTXOs inherit marker IDs from their parents, sweeping a marker sweeps exactly the VTXOs in its 100-depth window. No double-spend vector — a VTXO can only be swept once because `swept_marker` has a PK on `marker_id`.
- Migration safety — the migration adds columns with `DEFAULT 0`/`DEFAULT '[]'`, which is safe for existing data. The view recreation (`DROP VIEW IF EXISTS` + `CREATE VIEW`) is atomic within the migration transaction. ⚠️ One concern: the `DROP VIEW IF EXISTS vtxo_vw` will invalidate any in-flight queries using the old view. On a busy production server, this could cause brief errors. Consider deploying during low traffic or ensuring the migration runs in a maintenance window.
- No exit path impact — unilateral exit paths are not modified. Marker/depth tracking is metadata for operational efficiency, not consensus-critical.
- Existing VTXOs don't get markers — explicitly noted in the PR. Pre-migration VTXOs have `depth=0` and `markers=[]`, so they fall back to per-VTXO sweep behavior. This is correct — no false swept status.
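The append-only swept model can be illustrated with a small sketch (names hypothetical). It mirrors the semantics of the EXISTS subquery in `vtxo_vw`: a VTXO counts as swept as soon as any of its marker IDs appears in the swept set:

```go
package main

import "fmt"

// swept reports whether a VTXO is swept under the append-only
// swept_marker model: a VTXO is swept iff any of its marker IDs has
// been inserted into the swept set. Illustrative only.
func swept(vtxoMarkers []string, sweptMarkers map[string]struct{}) bool {
	for _, id := range vtxoMarkers {
		if _, ok := sweptMarkers[id]; ok {
			return true
		}
	}
	return false
}

func main() {
	// One INSERT into the swept set sweeps every VTXO carrying that
	// marker, instead of one UPDATE per VTXO.
	sweptSet := map[string]struct{}{"m_100": {}}
	fmt.Println(swept([]string{"m_0", "m_100"}, sweptSet)) // true
	fmt.Println(swept([]string{"m_0"}, sweptSet))          // false
}
```

Because the swept status is derived rather than stored per VTXO, it can never go stale, which is the "correct by construction" property the review points out.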
🟡 Observations
- `GetVtxosByMarker` error is swallowed in `ensureVtxosCached` — line `if err != nil { continue }`. The test (`TestEnsureVtxosCached_GetVtxosByMarkerErrorSwallowed`) explicitly validates this behavior. This is acceptable since the VTXO itself is still cached via the direct DB lookup, but consider at least logging the error at debug level so operational issues with the marker index don't go unnoticed.
- `SweepMarkerWithDescendants` — recursive CTE for bulk sweeping is powerful but could be expensive on deep marker chains. Ensure there's a depth limit or that this is only called during scheduled sweeps, not on hot paths.
- SQLite migration is 138 lines — significantly more complex than Postgres due to SQLite's limited `ALTER TABLE` support. Consider adding a comment in the migration file noting which sections are SQLite workarounds.
- Proto field number 15 for `depth` — leaving gaps (14 is `assets`, 15 is `depth`) is fine for proto. Just noting it's consistent across both `Vtxo` and `IndexerVtxo` messages.
- No down migration for `swept_marker` — `20260210100000_add_depth_and_markers.down.sql` exists but I couldn't verify its contents in this diff. Ensure it drops `swept_marker` and `marker`, and removes the `depth`/`markers` columns.
Cross-repo impact
- SDKs need to expose the new `depth` field on VTXOs — this is additive and non-breaking (new field, default 0).
- Pagination via `pageToken`/`nextPageToken` on `GetVtxoChain` is additive — existing clients that don't set these fields get the old behavior.
- The e2e test gap (noted in PR: "require changes to the SDK to support new proto field") should be tracked as a follow-up issue.
Test coverage
Test files account for ~62% of the diff. Key scenarios covered: cursor round-trip, cache hits/misses, marker dedup, DB error propagation, pagination (first page, resume with token, short chain, page size limits), and sweeper checkpoint creation with markers. Comprehensive for a change of this scope.
Verdict: Strong architectural improvement with clear performance wins. The marker model is well-designed and the migration is safe. The two items worth attention are: (1) logging swallowed errors in ensureVtxosCached, and (2) ensuring the view recreation in migration happens during low-traffic. LGTM for the design — recommend a second human review given the scope (10K+ lines touching DB schema, sweep logic, and indexer).
# Conflicts: # internal/core/application/utils_test.go
🔍 Review — Scale the DAG (marker-based VTXO chain optimization)

Large PR (10.5k+ / 900-), but the core idea is sound: replace per-VTXO traversal with marker-based checkpoints every 100 depths, enabling bulk operations. 62% is test code. Reviewed the new domain model, DB migrations, indexer changes, sweeper refactor, and event plumbing.

✅ Architecture
🔍 Arkana PR Review — Scale the DAG

Summary: Introduces a marker-based DAG checkpoint system for VTXO chains, replacing per-VTXO sweep operations with bulk marker-based sweeps and adding in-memory caching for

Scale: +10,521 / -893 lines. ~62% tests, ~37% application code. 3 storage backends (Postgres, SQLite, Badger).

Architecture Review: ✅ Sound design

The marker system is well-conceived:
Security Review
Specific Code Observations
Dust VTXO handling
Postgres migration
SQLite migration
Cross-Repo Impact
Test Coverage

Thorough — 856 lines for indexer tests, 1166 for sweeper tests, 661 for domain marker tests, 3427 for DB service tests (including pagination round-trips). Tests correctly simulate the real scenarios (cache misses, marker dedup, error swallowing, cross-page BFS).

Items to Address
No blocking issues. Well-structured PR with strong test coverage.
Per #908 (comment) I have added the milliseconds change into the db migrations. Concerns 1 and 3 are not worth addressing. Done in: 86f9190
🔍 Arkana PR Review — arkd#908 (follow-up)

2 new commits since last review (
Assessment
No new concerns. Prior review findings still apply (page token signing, swept_at migration — both already addressed in earlier commits).
* optimize GetVtxoChain with marker-based bulk preloading
* Test to prove reduction in calls to GetVtxos
* sorting in tests to guarantee vtxo id ordering
🔍 Arkana PR Review — arkd#908 (follow-up)

1 new commit (

Code Review
This is the key piece that turns the O(N) per-VTXO DB round-trips into O(N/100) marker-based bulk fetches.

Guard clause: Correctly checks

Tests (287 new lines):
Observation

The

Security

No impact on exit paths, forfeit validation, or sweep correctness. The preload is purely a read-side optimization feeding the same BFS loop.

✅ No blocking issues. Tests are thorough and the optimization is well-structured.
# Conflicts:
#   internal/interface/grpc/handlers/parser_test.go
👋 @bitcoin-coder-bob — this PR has been open for 39 days with outstanding change requests from @louisinger. Last push was over 30 days ago. Is this still being worked on, or is it blocked on something? Let us know if we can help unblock.
🔍 Arkana PR Review (incremental — new commits since last review)

Changes since last review: Added comprehensive benchmark suite (

Assessment: ✅ Excellent test additions

Benchmarks (

New unit tests (

The

No security concerns in this increment — all changes are test/bench code. The core DAG traversal and marker logic were reviewed in the previous pass.
Hey @bitcoin-coder-bob — @louisinger requested changes on Mar 16. Any update on this? Has the feedback been addressed in recent commits?
Iterative review — new commit

This commit addresses a real race condition: the DB projection updates asynchronously, so

The fix:
Review notes:
Good fix for a subtle but important race. 👍
Closes #833
Summary by CodeRabbit
- `pageToken` and `nextPageToken` parameters.
- `depth` field to VTXOs to track their position in transaction chains.

NOTE: `depth` on the vtxo) so they are not expanded on in this PR.

Here's a breakdown of the efficiency gains and db query savings:
Yes, this PR has a lot of lines of code. 62% are in test files, 0.8% come from the `api-spec` folder, and the other 37.2% are "actual" code changes (4,254 LoC).
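For context on the claimed query savings, here is a back-of-envelope sketch (illustrative arithmetic only, assuming the PR's MarkerInterval of 100):

```go
package main

import "fmt"

// queries compares rough query counts for fetching a chain of n VTXOs:
// per-VTXO traversal issues ~n GetVtxos round-trips, while marker-window
// prefetching issues roughly ceil(n/interval) bulk fetches.
func queries(n, interval int) (perVtxo, perMarker int) {
	return n, (n + interval - 1) / interval
}

func main() {
	// The 5k-tx chain discussed in the review thread:
	oldQ, newQ := queries(5000, 100)
	fmt.Println(oldQ, newQ) // 5000 50
}
```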