Skip to content

Non-Idempotent Inserts in SQL-Backed MS-SMT #2116

@jtobin

Description

@jtobin

Found while resurrecting #770.

Summary

The SQL insert queries for mssmt_nodes use plain INSERT INTO,
which fails with a UNIQUE constraint violation when a node with the
same (hash_key, namespace) already exists. This affects both
SQLite and Postgres, since both backends share the same queries in
tapdb/sqlc/queries/mssmt.sql.

Root Cause

The three insert queries (InsertBranch, InsertLeaf,
InsertCompactedLeaf) all do strict inserts:

INSERT INTO mssmt_nodes (
    hash_key, l_hash_key, r_hash_key, key, value, sum, namespace
) VALUES ($1, $2, $3, NULL, NULL, $4, $5);

The primary key is (hash_key, namespace). In a Merkle tree,
identical hashes mean identical content, so re-inserting the same
hash is semantically a no-op. The in-memory DefaultStore uses a
Go map, where re-assignment silently succeeds. The SQL store does
not tolerate it.

How to Trigger

TestDeletion in mssmt/tree_test.go runs testDeletion and
testBatchDeletion sequentially on the same store.
testDeletion inserts 100 random leaves, then deletes them one by
one. testBatchDeletion then re-inserts the same 100 leaves into
the same store. During re-insertion, intermediate branch nodes
collide with nodes that survived the individual deletions (nodes
still referenced by other paths aren't deleted).

This produces two failures:

  • sqlite3/full_SMT: UNIQUE constraint violation during
    testBatchDeletion's Insert calls (line 606).
  • sqlite3/smol_SMT: After DeleteAllNodes, Get returns a
    non-nil leaf (line 618), likely due to the compacted tree's
    in-memory cache not being invalidated after the SQL-level bulk
    delete.

Practical Impact

The TaprootAssetTreeStore is the production SQL-backed MS-SMT
used for universe trees, asset commitment trees, etc. Production
usage patterns (single-leaf mutations per transaction) likely don't
trigger this, since the tree algorithm deletes old branch nodes
before inserting new ones along the same path. But any operation
that produces a branch node whose hash matches a node still live
elsewhere in the tree would fail.

Fix

Change the three insert queries to use INSERT OR IGNORE (SQLite)
/ ON CONFLICT DO NOTHING (Postgres). Since hash determines
content in a Merkle tree, ignoring duplicates is correct — a node
with the same hash is guaranteed to have the same data.

Discovery

This bug was found by PR #770, which registers the SQL tree store
driver so that mssmt unit tests actually run against the SQL
backend. Previously, genTestStores only produced constructors for
the in-memory DefaultStore because no TreeStoreDriver was ever
registered, making the sqlite3 guard in the loop dead code.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingdatabase

    Type

    No type

    Projects

    Status

    🆕 New

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions