-
Notifications
You must be signed in to change notification settings - Fork 57
Scale the DAG #833
Description
We want a schema and flow that:
- Scales to very deep DAGs (20k+ depth).
- Keeps sweeping as a cheap bulk operation.
- Lets us query
GetVtxoChainandListVtxos (is_swept)efficiently.
Core ideas:
- Track a
depthon everyvtxo(longest-path depth). - Every 100 depths, create a marker in a
markerstable. - Each marker records all parent markers (multi‑parent DAG).
- A
swept_markerstable isolates sweep state as bulk inserts only. vtxogets amarkers text[]column (the markers that “cover” it).- Drop
vtxo.swept; computeis_sweptby joining againstswept_markers. GetVtxoChainuses markers to chunk recursion / traversal into healthy pieces.
1. Schema Changes
1.1 vtxo table
ALTER TABLE vtxo
ADD COLUMN depth INT NOT NULL DEFAULT 0;
ALTER TABLE vtxo
ADD COLUMN markers TEXT[]; -- e.g. ARRAY['txid:vout', 'txid2:vout2'];
ALTER TABLE vtxo
DROP COLUMN swept;
CREATE INDEX CONCURRENTLY idx_vtxo_markers_gin
ON vtxo USING GIN (markers);1.2 markers table
CREATE TABLE markers (
-- A marker is a concrete vtxo outpoint: "txid:vout"
marker_id TEXT PRIMARY KEY,
depth INT NOT NULL,
parent_markers TEXT[] NOT NULL -- array of marker_ids (multi-parent DAG)
);
CREATE INDEX idx_markers_parent_gin
ON markers USING GIN (parent_markers);1.3 swept_markers table
CREATE TABLE swept_markers (
marker_id TEXT PRIMARY KEY,
swept_at BIGINT NOT NULL -- unix timestamp
);2. Ingestion Logic
Goal: For every new vtxo:
- Compute its
depth. - Assign its
markers text[]. - Optionally insert a new
markersrow if it sits on a 100‑depth boundary.
2.1 Depth
For a new tx with inputs parents []Vtxo:
func computeDepth(parents []Vtxo) int {
maxDepth := 0
for _, p := range parents {
if p.Depth > maxDepth {
maxDepth = p.Depth
}
}
return maxDepth + 1
}Store depth on the new vtxo.
2.2 Markers and marker rows
- Interval:
MarkerInterval = 100. markerId(v) = v.Txid + ":" + strconv.Itoa(v.Vout).
const MarkerInterval = 100
func assignMarkersAndMaybeCreateMarker(ctx context.Context, db *sql.DB, newVtxo *Vtxo, parents []Vtxo) error {
// depth already computed
depth := newVtxo.Depth
// union parent markers
parentMarkerSet := make(map[string]struct{})
for _, p := range parents {
for _, m := range p.Markers {
parentMarkerSet[m] = struct{}{}
}
}
parentMarkers := make([]string, 0, len(parentMarkerSet))
for m := range parentMarkerSet {
parentMarkers = append(parentMarkers, m)
}
if depth%MarkerInterval == 0 {
// This vtxo becomes a marker
id := newVtxo.Txid + ":" + strconv.Itoa(newVtxo.Vout)
_, err := db.ExecContext(ctx, `
INSERT INTO markers (marker_id, depth, parent_markers)
VALUES ($1, $2, $3)
ON CONFLICT (marker_id) DO NOTHING
`, id, depth, pq.Array(parentMarkers))
if err != nil {
return err
}
newVtxo.Markers = []string{id}
return nil
}
// Not a marker: inherit parent markers (multi‑parent)
newVtxo.Markers = parentMarkers
return nil
}Insert vtxo and the markers row in the same transaction.
3. Sweeper Logic
Goal: Sweeping should only be bulk inserts into swept_markers. No per‑vtxo updates.
3.1 Sweep from a root marker down
Given a starting marker_id (e.g. root or any “sweep point”), mark it and all descendant markers as swept:
WITH RECURSIVE target_markers AS (
SELECT marker_id
FROM markers
WHERE marker_id = $1
UNION ALL
SELECT m.marker_id
FROM markers m
JOIN target_markers t
ON t.marker_id = ANY(m.parent_markers)
)
INSERT INTO swept_markers (marker_id, swept_at)
SELECT marker_id, EXTRACT(EPOCH FROM NOW())::BIGINT
FROM target_markers
ON CONFLICT (marker_id) DO NOTHING;For multiple starting markers:
WITH RECURSIVE target_markers AS (
SELECT marker_id
FROM markers
WHERE marker_id = ANY($1::text[])
UNION ALL
SELECT m.marker_id
FROM markers m
JOIN target_markers t
ON t.marker_id = ANY(m.parent_markers)
)
INSERT INTO swept_markers (marker_id, swept_at)
SELECT DISTINCT marker_id, EXTRACT(EPOCH FROM NOW())::BIGINT
FROM target_markers
ON CONFLICT (marker_id) DO NOTHING;Sweeper service just calls this with the appropriate root marker(s).
4. ListVtxos / fetchVtxo: is_swept via view
Goal: is_swept must be cheap ($O(1)$) and derived from markers.
CREATE OR REPLACE VIEW vtxo_vw AS
SELECT
v.*,
EXISTS (
SELECT 1
FROM swept_markers sm
WHERE sm.marker_id = ANY(v.markers)
) AS is_swept
FROM vtxo v;ListVtxos can now select from vtxo_vw and expose is_swept without touching the sweeper logic.
Example:
SELECT txid, vout, depth, markers, is_swept, amount, expires_at
FROM vtxo_vw
WHERE pubkey = $1
ORDER BY created_at DESC
LIMIT $2;5. GetVtxoChain(outpoints): chunked DAG via markers
Goal: Given one or more outpoints, return a slim, paginated DAG of child → parents without recursing 20k+ steps in one go.
Two key pieces:
- Marker skeleton traversal (very light, every 100 depth).
- Gap fill per skeleton segment using
depthranges.
5.1 Marker skeleton (upward)
Start from the marker(s) that cover the initial outpoint(s) and walk parents up, limited per page.
WITH RECURSIVE marker_chain AS (
-- start marker for this page (derived from vtxo.markers of the starting outpoint)
SELECT marker_id, parent_markers, depth, 1 AS level
FROM markers
WHERE marker_id = $1
UNION ALL
SELECT m.marker_id, m.parent_markers, m.depth, mc.level + 1
FROM markers m
JOIN marker_chain mc
ON m.marker_id = ANY(mc.parent_markers)
WHERE mc.level < $2 -- e.g. limit to 50 markers per page
)
SELECT marker_id, parent_markers, depth
FROM marker_chain;Client encodes the last marker_id / level into page_token to request the next chunk.
5.2 Gap fill between markers (0–100 depth window)
For each [depth_lo, depth_hi] window (e.g. [1000, 1100]) defined by two markers, fetch the actual tx graph:
SELECT
t.txid,
COALESCE(v.expires_at, 0) AS expires_at,
t.type,
array_remove(array_agg(DISTINCT p.txid), NULL) AS spends
FROM tx t
JOIN vtxo v
ON v.txid = t.txid
LEFT JOIN vtxo p
ON p.spent_by = t.txid
WHERE v.depth BETWEEN $1 AND $2 -- e.g. 1000 and 1100
GROUP BY t.txid, v.expires_at, t.type
ORDER BY v.depth DESC; -- or ASC, depending on API semanticsEach row maps cleanly to IndexerChain:
message IndexerChain {
string txid = 1;
int64 expires_at = 2;
IndexerChainedTxType type = 3;
repeated string spends = 4;
}The service:
- Uses markers to find depth windows per page.
- Runs the gap‑fill query for that window.
- Builds
IndexerChainresults. - Returns a
page_tokenencoding the next marker / window.
6. Why this works
- Sweeper: Only runs bulk inserts into a small
swept_markerstable. - VTXO rows: Immutable w.r.t sweep; no 20k‑row UPDATEs.
is_swept: SimpleEXISTSovermarkers[]×swept_markers.GetVtxoChain: Works by:- traversing a 100× compressed marker graph (recursive, but light),
- then fetching bounded depth ranges per page (no deep recursion on raw vtxos).
This keeps writes cheap, reads predictable, and makes the DAG depth effectively unbounded while still API‑friendly.