-
Notifications
You must be signed in to change notification settings - Fork 132
Open
Description
Background
The QueryUniverseAssetStats
query in tapdb/sqlc/queries/universe.sql:225-312
is a complex analytical query that:
- Calculates asset supply by aggregating universe leaves
- Computes group supply from MSSMT nodes
- Joins multiple tables including views (
genesis_info_view
,key_group_info_view
) - Performs dynamic sorting across 8 different columns with ASC/DESC options
Performance Concerns
- Multiple CTEs with aggregations: The query uses
asset_supply
andgroup_supply
CTEs that performSUM()
operations across large datasets - Complex joins: Joins across 7+ tables/views including
universe_leaves
,universe_roots
,mssmt_nodes
,genesis_info_view
,key_group_info_view
, anduniverse_stats
- Dynamic sorting: The ORDER BY clause has 16 different CASE statements for flexible sorting
Proposed Optimizations
1. ANALYZE Strategy
Implement regular ANALYZE operations to ensure PostgreSQL has accurate statistics:
-- Run after bulk imports or significant data changes
ANALYZE universe_leaves;
ANALYZE universe_roots;
ANALYZE mssmt_nodes;
ANALYZE mssmt_roots;
ANALYZE universe_stats;
-- Consider more frequent targeted ANALYZE for high-churn tables
ANALYZE universe_leaves (leaf_node_key, leaf_node_namespace, universe_root_id);
ANALYZE mssmt_nodes (key, namespace, sum);
2. Materialized View Approach
Create a materialized view to pre-compute the expensive aggregations:
CREATE MATERIALIZED VIEW universe_asset_stats_mv AS
WITH asset_supply AS (
SELECT SUM(nodes.sum) AS supply, gen.asset_id AS asset_id
FROM universe_leaves leaves
JOIN universe_roots roots ON leaves.universe_root_id = roots.id
JOIN mssmt_nodes nodes ON leaves.leaf_node_key = nodes.key
AND leaves.leaf_node_namespace = nodes.namespace
JOIN genesis_info_view gen ON leaves.asset_genesis_id = gen.gen_asset_id
WHERE roots.proof_type = 'issuance'
GROUP BY gen.asset_id
),
group_supply AS (
SELECT sum AS num_assets, uroots.group_key AS group_key
FROM mssmt_nodes nodes
JOIN mssmt_roots roots ON nodes.hash_key = roots.root_hash
AND nodes.namespace = roots.namespace
JOIN universe_roots uroots ON roots.namespace = uroots.namespace_root
WHERE uroots.proof_type = 'issuance'
)
SELECT
asset_supply.supply,
group_supply.num_assets AS group_supply,
gen.asset_id,
gen.asset_tag AS asset_name,
gen.asset_type,
gen.block_height AS genesis_height,
gen.prev_out AS genesis_prev_out,
group_info.tweaked_group_key AS group_key,
gen.output_index AS anchor_index,
gen.anchor_txid,
universe_stats.total_asset_syncs AS total_syncs,
universe_stats.total_asset_proofs AS total_proofs
FROM genesis_info_view gen
JOIN asset_supply ON asset_supply.asset_id = gen.asset_id
LEFT JOIN key_group_info_view group_info ON gen.gen_asset_id = group_info.gen_asset_id
LEFT JOIN group_supply ON group_supply.group_key = group_info.x_only_group_key
JOIN universe_stats ON gen.asset_id = universe_stats.asset_id
WHERE universe_stats.proof_type = 'issuance';
-- Create indexes for common access patterns
CREATE INDEX idx_universe_asset_stats_mv_asset_id ON universe_asset_stats_mv(asset_id);
CREATE INDEX idx_universe_asset_stats_mv_asset_name ON universe_asset_stats_mv(asset_name);
CREATE INDEX idx_universe_asset_stats_mv_asset_type ON universe_asset_stats_mv(asset_type);
3. Refresh Strategy
-- Option 1: Periodic refresh (e.g., every hour)
REFRESH MATERIALIZED VIEW CONCURRENTLY universe_asset_stats_mv;
-- Option 2: Event-driven refresh after universe updates
-- Could be triggered after sync operations or new proof insertions
Benefits
- Reduced query complexity: Pre-computed aggregations eliminate expensive runtime calculations
- Improved response times: Direct queries against materialized view vs. complex joins
- Predictable performance: Less variance in query execution times
- Concurrent refresh: Using
CONCURRENTLY
allows reads during refresh
Considerations
- Storage overhead for materialized view
- Staleness window between refreshes
- Need to modify application code to query the materialized view when appropriate
Next Steps
- Benchmark current query performance with production-like data volumes
- Test materialized view approach in staging environment
- Implement ANALYZE automation in maintenance routines
- Consider partial indexes on sorting columns if specific sort patterns dominate