In the [previous section](/use-cases/data-lake/getting-started/connecting-catalogs), you connected ClickHouse to a data catalog and queried open table formats directly. While querying data in place is convenient, open table formats are not optimized for the low-latency, high-concurrency workloads that power dashboards and operational reporting. For these use cases, loading data into ClickHouse's [MergeTree](/engines/table-engines/mergetree-family/mergetree) engine delivers dramatically better performance.
MergeTree offers several advantages over reading open table formats directly:
1 row in set. Elapsed: 1.265 sec.
```
## Query over the data lake table {#query-lakehouse}
Let's run a query that filters logs by thread name and instance type, searches the message text for errors, and groups results by logger:
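A sketch of such a query is shown below. The literal filter values are placeholders for illustration; adapt them to the values you want to search for:

```sql
-- Illustrative sketch: filter by thread name and instance type,
-- search message text for errors, and group by logger.
-- 'scheduler' and 'm5.xlarge' are placeholder filter values.
SELECT
    logger,
    count() AS error_count
FROM unity.`icebench.single_day_log`
WHERE thread_name = 'scheduler'
  AND instance_type = 'm5.xlarge'
  AND message ILIKE '%error%'
GROUP BY logger
ORDER BY error_count DESC
```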
### Insert data from the catalog {#insert-data}
Use `INSERT INTO SELECT` to load the ~300 million rows from the data lake table into our ClickHouse table:
```sql
INSERT INTO single_day_log SELECT * FROM icebench.`icebench.single_day_log`
```
In the previous guides, you queried open table formats in place and loaded data into MergeTree for fast analytics. In many architectures, data also needs to flow in the other direction - from ClickHouse back into open table formats. Two common scenarios drive this:
- **Offloading to long-term storage** - Data arrives in ClickHouse as a real-time analytics layer, powering dashboards and operational reporting. Once the data ages beyond its real-time window, it can be written out to Iceberg in object storage for durable, cost-effective retention in an interoperable format.
- **Reverse ETL** - Transformations, aggregations, and enrichment performed inside ClickHouse produce derived datasets that downstream tools and other teams need to consume. Writing these results to Iceberg tables makes them available across the broader data ecosystem.
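The offloading scenario could be sketched roughly as follows. The target table `logs_archive` and the 90-day cutoff are hypothetical, and Iceberg write support may be experimental or unavailable depending on your ClickHouse version:

```sql
-- Hypothetical offload: copy aged rows from a MergeTree table into an
-- Iceberg-backed table (logs_archive) that is assumed to already exist.
-- The 90-day retention window is illustrative.
INSERT INTO logs_archive
SELECT *
FROM single_day_log
WHERE event_time < now() - INTERVAL 90 DAY
```

After verifying the copy, the aged rows can be dropped from the MergeTree table to complete the offload.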
ClickHouse integrates with open table formats, including [Apache Iceberg](/engines/table-engines/integrations/iceberg), [Delta Lake](/engines/table-engines/integrations/deltalake), [Apache Hudi](/engines/table-engines/integrations/hudi), and [Apache Paimon](/sql-reference/table-functions/paimon). This allows users to connect ClickHouse to data already stored in these formats across object storage, combining the analytical power of ClickHouse with their existing data lake infrastructure.
## Why use ClickHouse with open table formats? {#why-clickhouse-uses-lake-formats}
### Query existing data in place {#querying-data-in-place}
ClickHouse can query open table formats directly in object storage without duplicating data. Organizations standardized on Iceberg, Delta Lake, Hudi, or Paimon can point ClickHouse at existing tables and immediately use its SQL dialect, analytical functions, and efficient native Parquet reader. At the same time, tools like [clickhouse-local](/operations/utilities/clickhouse-local) and [chDB](/chdb) enable exploratory, ad hoc analysis across more than 70 file formats in remote storage, allowing users to interactively explore data lake datasets with no infrastructure setup.
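As a minimal example of this kind of ad hoc exploration, a query such as the following could be passed to `clickhouse-local` to count rows across Parquet files in a bucket (the bucket path is a placeholder):

```sql
-- Run with: clickhouse-local --query "<this query>"
-- The bucket path is a placeholder; NOSIGN skips credentials
-- for a publicly readable bucket.
SELECT count()
FROM s3('https://mybucket.s3.amazonaws.com/logs/*.parquet', NOSIGN, 'Parquet')
```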
Users can achieve this either with direct reading, using [table functions and table engines](/use-cases/data-lake/getting-started/querying-directly), or by [connecting to a data catalog](/use-cases/data-lake/getting-started/connecting-catalogs).
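For direct reading, a sketch using the `icebergS3` table function might look like this; the bucket URL and credentials are placeholders:

```sql
-- Placeholder path and credentials; adjust for your bucket.
SELECT count()
FROM icebergS3('https://mybucket.s3.amazonaws.com/warehouse/logs/',
               '<access_key_id>', '<secret_access_key>')
```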
This page provides comprehensive support matrices for ClickHouse's data lake integrations. It covers the features available for each open table format, the catalogs ClickHouse can connect to, and the capabilities supported by each catalog.
## Open table format support {#format-support}
ClickHouse integrates with four open table formats: [Apache Iceberg](/engines/table-engines/integrations/iceberg), [Delta Lake](/engines/table-engines/integrations/deltalake), [Apache Hudi](/engines/table-engines/integrations/hudi), and [Apache Paimon](/sql-reference/table-functions/paimon). Select a format below to view its support matrix.