[Enhancement]: Time partition a hypertable using CUID #9686

@arimdjonokszabbix

Description

What type of enhancement is this?

Performance

What subsystems and features will be improved?

Partitioning, Query planner

What does the enhancement do?

Zabbix uses CUIDs for primary keys and needs TimescaleDB support for time partitioning on them, including planner-side chunk pruning.

The original optimization issue is described in Zabbix ticket ZBX-25808.

Initially we had an auditlog table:

zabbix=> \d auditlog;
                                Table "public.auditlog"
    Column     |          Type          | Collation | Nullable |        Default
---------------+------------------------+-----------+----------+-----------------------
 auditid       | character varying(25)  |           | not null |
 userid        | bigint                 |           |          |
 username      | character varying(100) |           | not null | ''::character varying
 clock         | integer                |           | not null | 0
 ip            | character varying(39)  |           | not null | ''::character varying
 action        | integer                |           | not null | 0
 resourcetype  | integer                |           | not null | 0
 resourceid    | bigint                 |           |          |
 resource_cuid | character varying(25)  |           |          |
 resourcename  | character varying(255) |           | not null | ''::character varying
 recordsetid   | character varying(25)  |           | not null |
 details       | text                   |           | not null | ''::text
Indexes:
    "auditlog_pkey" PRIMARY KEY, btree (auditid)
    "auditlog_1" btree (userid, clock)
    "auditlog_2" btree (clock)
    "auditlog_3" btree (resourcetype, resourceid)
    "auditlog_4" btree (recordsetid)
    "auditlog_5" btree (ip)

with data like:

zabbix=> select * from auditlog limit 1;
          auditid          | userid | username |   clock    |      ip      | action | resourcetype | resourceid | resource_cuid | resourcename |        recordsetid        | details
---------------------------+--------+----------+------------+--------------+--------+--------------+------------+---------------+--------------+---------------------------+---------
 cmnzmt5pp0001c8iywcc35czo |      1 | Admin    | 1776232231 | 192.168.64.1 |      8 |            0 |          1 |               |              | cmnzmt5pp0000c8iyuhs5ng5e |
(1 row)

which seemed like a nice candidate for a hypertable.

The problems were:

  • The auditlog primary key is a CUID (chosen as a collision-resistant, efficiently generated ID so that the UI and back-end do not need to coordinate with each other when generating IDs).

  • TimescaleDB requires the time partitioning column (clock) to be part of the primary key.

  • Modifying the primary key of existing tables to add clock was a very costly operation at that time.

So, as a workaround, the time_partitioning_func option of create_hypertable() was used:

        PERFORM create_hypertable('auditlog', 'auditid', chunk_time_interval => 604800,
                        time_partitioning_func => 'cuid_timestamp', migrate_data => true, if_not_exists => true);

CREATE OR REPLACE FUNCTION cuid_timestamp(cuid varchar(25)) RETURNS integer AS $$
DECLARE
        base36 varchar;
        a char[];
        ret bigint;
        i int;
        val int;
        chars varchar;
BEGIN
        -- Characters 2-9 of a CUID hold the creation time as
        -- base36-encoded milliseconds since the Unix epoch.
        base36 := substring(cuid FROM 2 FOR 8);

        chars := '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';

        -- Reverse the string so the least significant digit comes first.
        FOR i IN REVERSE char_length(base36)..1 LOOP
                a := a || substring(upper(base36) FROM i FOR 1)::char;
        END LOOP;
        i := 0;
        ret := 0;
        -- Accumulate digit * 36^position.
        WHILE i < (array_length(a, 1)) LOOP
                val := position(a[i + 1] IN chars) - 1;
                ret := ret + (val * (36 ^ i));
                i := i + 1;
        END LOOP;

        -- Convert milliseconds to seconds to match the clock column.
        RETURN CAST(ret/1000 AS integer);
END;
$$ LANGUAGE plpgsql IMMUTABLE;

which extracts the clock value from the CUID.
And that worked fine: the automatic time-based partitioning worked.
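For reference, the same decoding can be sketched outside the database. This hypothetical Python equivalent mirrors what the PL/pgSQL function above computes (Python's int() decodes base36 directly, so the manual digit loop is unnecessary):

```python
def cuid_timestamp(cuid: str) -> int:
    """Extract the Unix timestamp (seconds) embedded in a CUID.

    Characters 2-9 of a CUID are the creation time in milliseconds,
    base36-encoded; int(..., 36) decodes base36 case-insensitively.
    """
    millis = int(cuid[1:9], 36)
    return millis // 1000
```

For the sample row above, cuid_timestamp('cmnzmt5pp0001c8iywcc35czo') yields 1776232231, matching its clock column.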

But then slow queries were reported.

The problem is that the query the front-end runs to display audit entries looks like:
SELECT a.auditid,a.userid,a.username,a.clock,a.action,a.resourcetype,a.ip,a.resourceid,a.resourcename,a.details,a.recordsetid FROM auditlog a WHERE a.clock>='1713106692' AND a.clock<='1776178692' ORDER BY a.auditid DESC LIMIT 1001;

which results in scanning all chunks:

Limit  (cost=233.21..5024.03 rows=1001 width=395) (actual time=390.724..391.023 rows=0 loops=1)
  ->  Merge Append  (cost=233.21..95082.84 rows=19818 width=395) (actual time=390.721..391.019 rows=0 loops=1)
        Sort Key: a_1.auditid DESC
        ->  Index Scan Backward using "655_655_auditlog_pkey" on _hyper_8_655_chunk a_1  (cost=0.12..8.14 rows=1 width=275) (actual time=0.031..0.031 rows=0 loops=1)
              Filter: ((clock >= 1713106692) AND (clock <= 1776178692))
... hundreds of chunks:
...
        ->  Index Scan Backward using "114_114_auditlog_pkey" on _hyper_8_114_chunk a_760  (cost=0.38..11.15 rows=1 width=275) (actual time=0.887..0.887 rows=0 loops=1)
              Filter: ((clock >= 1713106692) AND (clock <= 1776178692))
Planning Time: 687.697 ms
Execution Time: 397.334 ms
(1526 rows)


So, I am trying to fix this issue without adding new expensive indexes to the auditlog table.

At first this seemed like it should be easy: since a time_partitioning_func is supported for insertion, I assumed something similar would exist to hint the planner during searches.

But I have not found anything better than doing the manual pruning myself:

  1. Finding the chunks to query:
zabbix=> SELECT
    c.schema_name,
    c.table_name,
    ds.range_start,
    ds.range_end
FROM _timescaledb_catalog.chunk c
JOIN _timescaledb_catalog.chunk_constraint cc
  ON cc.chunk_id = c.id
JOIN _timescaledb_catalog.dimension_slice ds
  ON ds.id = cc.dimension_slice_id
JOIN _timescaledb_catalog.dimension d
  ON d.id = ds.dimension_id
WHERE d.hypertable_id = (
    SELECT id
    FROM _timescaledb_catalog.hypertable
    WHERE table_name = 'auditlog'
)
AND d.interval_length IS NOT NULL
AND ds.range_start < 1776178692
AND ds.range_end   > 1774278692
ORDER BY ds.range_start;
      schema_name      |     table_name     | range_start | range_end
-----------------------+--------------------+-------------+------------
 _timescaledb_internal | _hyper_8_102_chunk |  1773878400 | 1774483200
 _timescaledb_internal | _hyper_8_103_chunk |  1774483200 | 1775088000
 _timescaledb_internal | _hyper_8_104_chunk |  1775088000 | 1775692800
(4 rows)
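The chunk lookup above is a plain interval-overlap test: a chunk with half-open range [range_start, range_end) qualifies when range_start < query_end and range_end > query_start. A small sketch of that predicate, using the chunk values from the output above:

```python
def overlapping_chunks(chunks, q_start, q_end):
    # A half-open chunk [range_start, range_end) overlaps the query
    # window exactly when it starts before the window ends and ends
    # after the window starts -- the same predicate the SQL uses.
    return [name for name, start, end in chunks
            if start < q_end and end > q_start]

chunks = [
    ("_hyper_8_102_chunk", 1773878400, 1774483200),
    ("_hyper_8_103_chunk", 1774483200, 1775088000),
    ("_hyper_8_104_chunk", 1775088000, 1775692800),
]
# The window from the query above selects all three listed chunks.
print(overlapping_chunks(chunks, 1774278692, 1776178692))
```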
  2. And then doing the search on those chunks:
CREATE OR REPLACE FUNCTION cuid_upper(ts BIGINT)
RETURNS TEXT AS $$
BEGIN
    -- Largest CUID-shaped string for second ts:
    -- 'c' + 8 base36 digits of ts in milliseconds + 16 'z' characters.
    -- base36_encode() is a helper defined elsewhere in our schema.
    RETURN 'c' ||
           lpad(base36_encode(ts * 1000), 8, '0') ||
           repeat('z', 16);
END;
$$ LANGUAGE plpgsql IMMUTABLE;

CREATE OR REPLACE FUNCTION cuid_lower(ts BIGINT)
RETURNS TEXT AS $$
BEGIN
    -- Smallest CUID-shaped string for second ts:
    -- 'c' + 8 base36 digits of ts in milliseconds + 16 '0' characters.
    RETURN 'c' ||
           lpad(base36_encode(ts * 1000), 8, '0') ||
           repeat('0', 16);
END;
$$ LANGUAGE plpgsql IMMUTABLE;


zabbix=> explain analyze SELECT *
FROM (
    SELECT * FROM _timescaledb_internal._hyper_8_102_chunk
    UNION ALL
    SELECT * FROM _timescaledb_internal._hyper_8_103_chunk
    UNION ALL
    SELECT * FROM _timescaledb_internal._hyper_8_104_chunk
) t
WHERE auditid >= cuid_lower(1774278692)
  AND auditid <  cuid_upper(1776178692)
ORDER BY auditid DESC
LIMIT 1001;
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Limit  (cost=0.01..0.02 rows=1 width=1094) (actual time=0.022..0.023 rows=0 loops=1)
   ->  Sort  (cost=0.01..0.02 rows=0 width=1094) (actual time=0.020..0.021 rows=0 loops=1)
         Sort Key: t.auditid DESC
         Sort Method: quicksort  Memory: 25kB
         ->  Result  (cost=0.00..0.00 rows=0 width=0) (actual time=0.002..0.002 rows=0 loops=1)
               One-Time Filter: false
 Planning Time: 1.451 ms
 Execution Time: 0.0

Which does seem to work.
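The bound construction can likewise be sketched in Python (assuming, as in the SQL above, that the 16 characters after the timestamp block stay within [0-9a-z], so padding with '0'/'z' yields valid lexicographic bounds). One caveat worth noting: since the 8 digits encode milliseconds, a CUID created later within the upper-bound second sorts above cuid_upper(ts), so using ts + 1 for the upper bound is safer at that edge.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"

def base36_encode(n: int) -> str:
    # Minimal base36 encoder; the SQL above relies on a base36_encode()
    # helper defined elsewhere in the schema.
    digits = ""
    while n:
        n, rem = divmod(n, 36)
        digits = ALPHABET[rem] + digits
    return digits or "0"

def cuid_lower(ts: int) -> str:
    # Smallest CUID-shaped string for second ts: pad the tail with '0'.
    return "c" + base36_encode(ts * 1000).rjust(8, "0") + "0" * 16

def cuid_upper(ts: int) -> str:
    # Largest CUID-shaped string for second ts: pad the tail with 'z'.
    return "c" + base36_encode(ts * 1000).rjust(8, "0") + "z" * 16
```

The sample auditid from above (clock 1776232231) then falls between cuid_lower(1776232231) and cuid_upper(1776232232), so lexicographic comparison on auditid tracks time order.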

Of course this is not ideal, and I was searching for a better way of doing this (without adding new indexes or modifying the primary key).

And then I realized that recent TimescaleDB versions dropped support for time_partitioning_func() and replaced it with partition_func().

Probably for the same reason: the planner cannot make use of time_partitioning_func().
The new philosophy seems to be that partition_func() is stricter but planner-aware.
UUIDv7 seems to be well supported for this purpose, but Zabbix has used CUIDs for a long time and we cannot easily replace them.
Adding native support for CUIDs in TimescaleDB would greatly assist us.

Implementation challenges

No response

Metadata

Labels: enhancement