PAX(Partition Attributes Across) is an access method in database that combines the advantages of row storage (NSM, N-ary Storage Model) and column storage (DSM, Decomposition Storage Model), aiming to improve database query performance, especially in terms of cache efficiency.
In OLAP scenarios, PAX has batch write performance comparable to row storage and read performance comparable to column storage. PAX can adapt to both OSS storage models in cloud environments and traditional offline physical file-based storage methods.
PAX has the following features:
- CRUD Operations: Supports basic Create, Read, Update, and Delete capabilities
- Indexes Support: Supports multiple types of indexes to accelerate query operations, particularly enhancing data retrieval speed when handling large datasets.
- Concurrency Control & Isolation: Utilizes Multiversion concurrency control (MVCC) to achieve efficient concurrency management and read-write isolation, operating at the granularity of individual data files for improved security and performance.
- Data Encoding & Compression: Offers multiple encoding schemes (e.g., run-length encoding) and compression methods (e.g., zstd, zlib) with configurable levels, effectively reducing storage requirements while optimizing read performance.
- Statistics: Data files contain comprehensive statistical information used for rapid filtering and query optimization, minimizing unnecessary data scanning and accelerating query processing.
- Sparsing Filter & Row Filter: Enables efficient sparse filtering and row filtering through statistical metadata to reduce redundant data transfers.
- Data Clustering: Supports data sorting via specified algorithms on single or multiple columns to enhance data orderliness and query efficiency.
- Vectorized Engine: Includes an experimental vectorized execution engine designed to boost data processing capabilities and query performance, particularly for analytical workloads and report generation scenarios.
PAX will be built with --enable-pax when you build the Cloudberry. Dependency requirements are as follows:
- C/C++ Compiler: GCC/GCC-C++ 8 or later
- CMake: 3.11 or later
- Protobuf: 3.5.0 or later
- ZSTD (libzstd): 1.4.0 or later
Also, you need to run the following command at the top level of the Cloudberry source code directory to download the submodules:
git submodule update --init --recursive
The following submodules will be downloaded for building and tesing PAX:
- yyjson (
contrib/pax_storage/src/cpp/contrib/yyjson) - cpp-stub (
contrib/pax_storage/src/cpp/cotnrib) - googlebench (
contrib/pax_storage/src/cpp/cotnrib) - googletest (
contrib/pax_storage/src/cpp/cotnrib) - tabulate (
contrib/pax_storage/src/cpp/cotnrib)
When using the configure script in the Cloudberry, add --enable-cassert.
--enable-cassert will cause PAX to be compiled in DEBUG mode. And the GTEST in PAX will be built.
Run GTEST:
cd contrib/pax_storage/build
./src/cpp/test_main
To create a table in PAX format, you need to set the table access method to PAX. You can do this by following one of the following methods:
- Use the
USING PAXclause when creating the table
CREATE TABLE t1(a int, b int, c text) USING PAX;
- Set the default access mode for the table before create the table
-- set default table access method to PAX, All newly created tables will use the PAX.
SET default_table_access_method = pax;
-- Will create the PAX table
CREATE TABLE t1(a int, b int, c text);
Users can specify reloptions (such as compresstype, storage_format...) when creating a PAX table.
--- use the 'WITH' to specify reloptions
CREATE TABLE p2(a INT, b INT, c INT) USING pax WITH(compresstype='zstd');
--- use ',' to split multi reloptions
CREATE TABLE p2(a INT, b INT, c INT) USING pax WITH(compresstype='zstd', compresslevel=5);
PAX allows the user to use the ENCODING to specify compresstype and compresslevel. This allows different columns to use different compression/encoding algorithms.
CREATE TABLE t1 (c1 int ENCODING (compresstype=zlib),
c2 char ENCODING (compresstype=zstd, compresslevel=5),
c3 text, COLUMN c3 ENCODING (compresstype=RLE_TYPE))
USING PAX;
To check whether a table is in PAX format, use any of the following methods:
- Use
\d+ <tablename>
gpadmin=# \d+ t1
Table "public.t1"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+----------+-------------+--------------+-------------
a | integer | | | | plain | | |
b | integer | | | | plain | | |
c | text | | | | extended | | |
Distributed by: (a)
Access method: pax
- Query the system catalog tables
pg_classandpg_am
SELECT relname, amname FROM pg_class, pg_am WHERE relam = pg_am.oid AND relname = 't1';
relname | amname
---------+--------
t1 | pax
(1 row)
For AM(access methods) in Cloudberry, each AM has customized relation options. Users can use these options in a WITH() clause, for example WITH(minmax_columns='b,c', storage_format=porc).
| Name | Type | Optional | Default | Description |
|---|---|---|---|---|
| storage_format | string | porcporc_vec |
porc |
Controls the internal storage format. |
| compresstype | string | nonerledeltazstdzlib |
none |
The way to compress or encode column values. You can only select one of them. |
| compresslevel | int | [0, 19] |
0 |
Specifies the compression level. Lower values prioritize faster compression speed, while higher values prioritize better compression ratios. This option takes effect only when used with the compresstype parameter. |
| minmax_columns | string | <column1>,<column2>,...,<columnN> |
none |
Specifies columns in which min/max statistics need to be generated . Example: WITH(minmax_columns='t1,t2,t3') |
| bloomfilter_columns | string | <column1>,<column2>,...,<columnN> |
none |
Specifies columns in which bloom filter statistics need to be generated. |
| cluster_type | string | zorderlexical |
none |
Specify the clustering algorithm. |
| cluster_columns | string | <column1>,<column2>,...,<columnN> |
none |
Specify which columns need to be clustered. |
| Name | Type | Optional | Default | Description |
|---|---|---|---|---|
| pax.enable_sparse_filter | bool | on/off |
on |
Specifies whether to enable sparse filtering based on statistics. |
| pax.enable_row_filter | bool | on/off |
off |
Specifies whether to enable row filtering. |
| pax.scan_reuse_buffer_size | int | [1048576, 33554432] | 8388608 | The buffer block size used during scanning. |
| pax.max_tuples_per_group | int | [5, 524288] | 131072 | Specifies the maximum number of tuples allowed in each group. |
| pax.max_tuples_per_file | int | [131072, 8388608] | 1310720 | Specifies the maximum number of tuples allowed in each data file. |
| pax.max_size_per_file | int | [8388608, 335544320] | 67108864 | The maximum physical size allowed for each data file. The default value is 67108864 (64MiB). The actual file size might be slightly larger than the set size. Very large or small values might negatively impact performance. |
| pax.enable_toast | bool | on/off |
on |
Specifies whether to enable TOAST support. |
| pax.min_size_of_compress_toast | int | [524288, 1073741824] | 524288 | Specifies the threshold for creating compressed TOAST tables. If the character length exceeds this threshold, Cloudberry creates compressed TOAST tables for storage. |
| pax.min_size_of_external_toast | int | [10485760, 2147483647] | 10485760 | Specifies the threshold for creating external TOAST tables. If the character length exceeds this threshold, Cloudberry creates external TOAST tables for storage. |
| pax.default_storage_format | string | porc/porc_vec |
porc |
Controls the default storage format. |
| pax.bloom_filter_work_memory_bytes | int | [1024, 2147483647] | 10240 | Controls the maximum memory allowed for bloom filter usage. |