Skip to content

Latest commit

 

History

History
177 lines (131 loc) · 12.1 KB

File metadata and controls

177 lines (131 loc) · 12.1 KB

PAX

Overview

PAX(Partition Attributes Across) is an access method in database that combines the advantages of row storage (NSM, N-ary Storage Model) and column storage (DSM, Decomposition Storage Model), aiming to improve database query performance, especially in terms of cache efficiency.

In OLAP scenarios, PAX has batch write performance comparable to row storage and read performance comparable to column storage. PAX can adapt to both OSS storage models in cloud environments and traditional offline physical file-based storage methods.

PAX has the following features:

  • CRUD Operations: Supports basic Create, Read, Update, and Delete capabilities
  • Indexes Support: Supports multiple types of indexes to accelerate query operations, particularly enhancing data retrieval speed when handling large datasets.
  • Concurrency Control & Isolation: Utilizes Multiversion concurrency control (MVCC) to achieve efficient concurrency management and read-write isolation, operating at the granularity of individual data files for improved security and performance.
  • Data Encoding & Compression: Offers multiple encoding schemes (e.g., run-length encoding) and compression methods (e.g., zstd, zlib) with configurable levels, effectively reducing storage requirements while optimizing read performance.
  • Statistics: Data files contain comprehensive statistical information used for rapid filtering and query optimization, minimizing unnecessary data scanning and accelerating query processing.
  • Sparsing Filter & Row Filter: Enables efficient sparse filtering and row filtering through statistical metadata to reduce redundant data transfers.
  • Data Clustering: Supports data sorting via specified algorithms on single or multiple columns to enhance data orderliness and query efficiency.
  • Vectorized Engine: Includes an experimental vectorized execution engine designed to boost data processing capabilities and query performance, particularly for analytical workloads and report generation scenarios.

Build

Build PAX

PAX will be built with --enable-pax when you build the Cloudberry. Dependency requirements are as follows:

  • C/C++ Compiler: GCC/GCC-C++ 8 or later
  • CMake: 3.11 or later
  • Protobuf: 3.5.0 or later
  • ZSTD (libzstd): 1.4.0 or later

Also, you need to run the following command at the top level of the Cloudberry source code directory to download the submodules:

git submodule update --init --recursive

The following submodules will be downloaded for building and tesing PAX:

  • yyjson (contrib/pax_storage/src/cpp/contrib/yyjson)
  • cpp-stub (contrib/pax_storage/src/cpp/cotnrib)
  • googlebench (contrib/pax_storage/src/cpp/cotnrib)
  • googletest (contrib/pax_storage/src/cpp/cotnrib)
  • tabulate (contrib/pax_storage/src/cpp/cotnrib)

Build debug version

When using the configure script in the Cloudberry, add --enable-cassert.

--enable-cassert will cause PAX to be compiled in DEBUG mode. And the GTEST in PAX will be built.

Run GTEST:

cd contrib/pax_storage/build
./src/cpp/test_main

Usage

Create PAX table

To create a table in PAX format, you need to set the table access method to PAX. You can do this by following one of the following methods:

  • Use the USING PAX clause when creating the table
CREATE TABLE t1(a int, b int, c text) USING PAX;
  • Set the default access mode for the table before create the table
-- set default table access method to PAX, All newly created tables will use the PAX.
SET default_table_access_method = pax;

-- Will create the PAX table
CREATE TABLE t1(a int, b int, c text);

Create PAX table with reloptions

Users can specify reloptions (such as compresstype, storage_format...) when creating a PAX table.

--- use the 'WITH' to specify reloptions
CREATE TABLE p2(a INT, b INT, c INT) USING pax WITH(compresstype='zstd');

--- use ',' to split multi reloptions
CREATE TABLE p2(a INT, b INT, c INT) USING pax WITH(compresstype='zstd', compresslevel=5);

Create PAX table with column encoding options

PAX allows the user to use the ENCODING to specify compresstype and compresslevel. This allows different columns to use different compression/encoding algorithms.

CREATE TABLE t1 (c1 int ENCODING (compresstype=zlib),
                  c2 char ENCODING (compresstype=zstd, compresslevel=5),
                  c3 text, COLUMN c3 ENCODING (compresstype=RLE_TYPE))
                  USING PAX;

Show PAX table

To check whether a table is in PAX format, use any of the following methods:

  • Use \d+ <tablename>
gpadmin=# \d+ t1
                                            Table "public.t1"
 Column |  Type   | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+---------+-----------+----------+---------+----------+-------------+--------------+-------------
 a      | integer |           |          |         | plain    |             |              |
 b      | integer |           |          |         | plain    |             |              |
 c      | text    |           |          |         | extended |             |              |
Distributed by: (a)
Access method: pax
  • Query the system catalog tables pg_class and pg_am
SELECT relname, amname FROM pg_class, pg_am WHERE relam = pg_am.oid AND relname = 't1';

 relname | amname
---------+--------
 t1      | pax
(1 row)

Options

For AM(access methods) in Cloudberry, each AM has customized relation options. Users can use these options in a WITH() clause, for example WITH(minmax_columns='b,c', storage_format=porc).

Name Type Optional Default Description
storage_format string porc
porc_vec
porc Controls the internal storage format.
compresstype string none
rle
delta
zstd
zlib
none The way to compress or encode column values. You can only select one of them.
compresslevel int [0, 19] 0 Specifies the compression level. Lower values prioritize faster compression speed, while higher values prioritize better compression ratios. This option takes effect only when used with the compresstype parameter.
minmax_columns string <column1>,<column2>,...,<columnN> none Specifies columns in which min/max statistics need to be generated . Example: WITH(minmax_columns='t1,t2,t3')
bloomfilter_columns string <column1>,<column2>,...,<columnN> none Specifies columns in which bloom filter statistics need to be generated.
cluster_type string zorder
lexical
none Specify the clustering algorithm.
cluster_columns string <column1>,<column2>,...,<columnN> none Specify which columns need to be clustered.

GUC

Name Type Optional Default Description
pax.enable_sparse_filter bool on/off on Specifies whether to enable sparse filtering based on statistics.
pax.enable_row_filter bool on/off off Specifies whether to enable row filtering.
pax.scan_reuse_buffer_size int [1048576, 33554432] 8388608 The buffer block size used during scanning.
pax.max_tuples_per_group int [5, 524288] 131072 Specifies the maximum number of tuples allowed in each group.
pax.max_tuples_per_file int [131072, 8388608] 1310720 Specifies the maximum number of tuples allowed in each data file.
pax.max_size_per_file int [8388608, 335544320] 67108864 The maximum physical size allowed for each data file. The default value is 67108864 (64MiB). The actual file size might be slightly larger than the set size. Very large or small values might negatively impact performance.
pax.enable_toast bool on/off on Specifies whether to enable TOAST support.
pax.min_size_of_compress_toast int [524288, 1073741824] 524288 Specifies the threshold for creating compressed TOAST tables. If the character length exceeds this threshold, Cloudberry creates compressed TOAST tables for storage.
pax.min_size_of_external_toast int [10485760, 2147483647] 10485760 Specifies the threshold for creating external TOAST tables. If the character length exceeds this threshold, Cloudberry creates external TOAST tables for storage.
pax.default_storage_format string porc/porc_vec porc Controls the default storage format.
pax.bloom_filter_work_memory_bytes int [1024, 2147483647] 10240 Controls the maximum memory allowed for bloom filter usage.