The production instance of tiled.nsls2.bnl.gov now holds almost 70 million nodes.
tiled_catalog_prod=# select count(*) from nodes;
count
----------
66946050
(1 row)
We have followed the mantra, "Make it work, make it right, make it fast" (in that order), and we have seen that queries such as, "Fetch the single most recent scan for this beamline," are slow. There is some quite low-hanging fruit to make Tiled faster. The index created by this expression is not having the intended effect.
Index(
    "top_level_metadata",
    "parent",
    # include the keys of the default sorting ('time_created', 'id'),
    # used to avoid creating a temp sort index
    "time_created",
    "id",
    "metadata",
    "access_blob",
    postgresql_using="gin",
),
PostgreSQL accepts this (a multi-column GIN index over scalar columns like parent requires the btree_gin extension), unfortunately, but it's a bit of a foot-gun: GIN indexes cannot return rows in sorted order, so the included time_created and id columns can never serve the default ORDER BY, and a plain equality filter on parent is served far better by a B-tree. We can see that in production, the index is not effective.
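The problem can be reproduced locally in a scratch database. The table and extension setup below are my sketch, not Tiled's actual DDL:

```sql
-- btree_gin is what allows scalar columns in a GIN index in the first place
CREATE EXTENSION IF NOT EXISTS btree_gin;

CREATE TABLE nodes_repro (
    id           bigint,
    parent       bigint,
    time_created timestamptz,
    metadata     jsonb,
    access_blob  jsonb
);

CREATE INDEX top_level_metadata_repro ON nodes_repro
    USING gin (parent, time_created, id, metadata, access_blob);

-- GIN cannot return rows in index order, so the ORDER BY ... LIMIT 1
-- pattern gets no help from this index; the plan still needs a Sort:
EXPLAIN SELECT * FROM nodes_repro
WHERE parent = 0 ORDER BY time_created, id LIMIT 1;
```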
tiled_catalog_prod=# SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE relname = 'nodes'
ORDER BY indexrelname;
indexrelname | idx_scan | idx_tup_read | idx_tup_fetch
------------------------------+------------+--------------+---------------
ix_nodes_parent | 22364609 | 75956623183 | 65078973378
key_parent_unique_constraint | 366223859 | 1609316142 | 1608928295
metadata_tsvector_search | 71 | 3320247 | 0
nodes_pkey | 1363555876 | 1363934256 | 1217639562
top_level_metadata | 4504 | 83046299 | 0
(5 rows)
The key here is the 0 in idx_tup_fetch, the count of heap rows actually fetched via this index:
top_level_metadata | 4504 | 83046299 | 0
The planner occasionally tries the index, gets nothing useful from it, and fetches the rows through other paths instead.
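Before dropping the index outright, one way to double-check that it cannot serve the hot query even when the planner is pushed toward it. The session settings here are a standard planner-debugging trick, not something Tiled itself does:

```sql
BEGIN;
-- Discourage the sequential-scan plan for this transaction only:
SET LOCAL enable_seqscan = off;
EXPLAIN SELECT * FROM nodes
WHERE parent = 0 ORDER BY time_created, id LIMIT 1;
ROLLBACK;  -- SET LOCAL is scoped to the transaction anyway
```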
On a local test database with just 100 nodes, we can see that replacing this useless index with separately useful ones is effective, changing a (slow) Seq Scan on nodes into a (fast) Index Scan:
postgres=# EXPLAIN ANALYZE SELECT * FROM nodes WHERE parent=0 ORDER BY time_created, id LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Limit  (cost=3.76..3.77 rows=1 width=114) (actual time=0.097..0.098 rows=1 loops=1)
   ->  Sort  (cost=3.76..4.01 rows=100 width=114) (actual time=0.094..0.095 rows=1 loops=1)
         Sort Key: time_created, id
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Seq Scan on nodes  (cost=0.00..3.26 rows=100 width=114) (actual time=0.014..0.035 rows=100 loops=1)
               Filter: (parent = 0)
               Rows Removed by Filter: 1
 Planning Time: 0.244 ms
 Execution Time: 0.127 ms
(9 rows)
postgres=# -- Drop useless index
DROP INDEX CONCURRENTLY top_level_metadata;
-- Create separate GIN indexes of JSON columns and a B-tree index on the others.
CREATE INDEX CONCURRENTLY ix_nodes_access_blob
ON nodes USING GIN (access_blob);
CREATE INDEX CONCURRENTLY ix_nodes_metadata
ON nodes USING GIN (metadata jsonb_path_ops);
CREATE INDEX CONCURRENTLY ix_nodes_parent_time_id
ON nodes (parent, time_created, id);
DROP INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
postgres=# EXPLAIN ANALYZE SELECT * FROM nodes WHERE parent=0 ORDER BY time_created, id LIMIT 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.14..0.30 rows=1 width=114) (actual time=0.017..0.018 rows=1 loops=1)
   ->  Index Scan using ix_nodes_parent_time_id on nodes  (cost=0.14..16.20 rows=100 width=114) (actual time=0.014..0.015 rows=1 loops=1)
         Index Cond: (parent = 0)
 Planning Time: 0.508 ms
 Execution Time: 0.046 ms
(5 rows)
Further testing, incorporating queries on the GIN indexes on metadata and access_blob, will require larger sample data sizes. More to follow.
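For reference when designing those tests: a GIN index with jsonb_path_ops only accelerates the containment and jsonpath operators (@>, @?, @@), while the default jsonb operator class (used above for access_blob) additionally supports the key-existence operators (?, ?|, ?&). The candidate queries would look like this; the key names are hypothetical:

```sql
-- Should be able to use ix_nodes_metadata (jsonb_path_ops):
EXPLAIN ANALYZE SELECT * FROM nodes
WHERE metadata @> '{"scan_id": 1234}';

-- Should be able to use ix_nodes_access_blob (default jsonb opclass):
EXPLAIN ANALYZE SELECT * FROM nodes
WHERE access_blob ? 'user';
```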
(The index expression above is defined in tiled/tiled/catalog/orm.py, lines 101 to 111 at b98e34b.)