
Commit 5abc76b

Merge pull request #5745 from ClickHouse/fix-lakhouse-terminology

fixing lakehouse terminology

2 parents: 8654a73 + d2c22c3

5 files changed: 12 additions & 12 deletions

docs/use-cases/data_lake/getting-started.md

Lines changed: 1 addition & 1 deletion

@@ -1,5 +1,5 @@
 ---
-title: 'Getting started with lakehouse table formats'
+title: 'Getting started with open table formats'
 sidebar_label: 'Getting started'
 slug: /use-cases/data-lake/getting-started
 sidebar_position: 1

docs/use-cases/data_lake/guides/accelerating-analytics.md

Lines changed: 3 additions & 3 deletions

@@ -11,7 +11,7 @@ keywords: ['data lake', 'lakehouse', 'MergeTree', 'accelerate', 'analytics', 'in
 doc_type: 'guide'
 ---
 
-In the [previous section](/use-cases/data-lake/getting-started/connecting-catalogs), you connected ClickHouse to a data catalog and queried open table formats directly. While querying data in place is convenient, lakehouse formats are not optimized for the low-latency, high-concurrency workloads that power dashboards and operational reporting. For these use cases, loading data into ClickHouse's [MergeTree](/engines/table-engines/mergetree-family/mergetree) engine delivers dramatically better performance.
+In the [previous section](/use-cases/data-lake/getting-started/connecting-catalogs), you connected ClickHouse to a data catalog and queried open table formats directly. While querying data in place is convenient, open table formats are not optimized for the low-latency, high-concurrency workloads that power dashboards and operational reporting. For these use cases, loading data into ClickHouse's [MergeTree](/engines/table-engines/mergetree-family/mergetree) engine delivers dramatically better performance.
 
 MergeTree offers several advantages over reading open table formats directly:
 
@@ -90,7 +90,7 @@ FROM unity.`icebench.single_day_log`
 1 row in set. Elapsed: 1.265 sec.
 ```
 
-## Query over the lakehouse table {#query-lakehouse}
+## Query over the data lake table {#query-lakehouse}
 
 Let's run a query that filters logs by thread name and instance type, searches the message text for errors, and groups results by logger:
 
@@ -163,7 +163,7 @@ ORDER BY (instance_type, thread_name, toStartOfMinute(event_time))
 
 ### Insert data from the catalog {#insert-data}
 
-Use `INSERT INTO SELECT` to load the ~300m from the lakehouse table into our ClickHouse table:
+Use `INSERT INTO SELECT` to load the ~300m from the data lake table into our ClickHouse table:
 
 ```sql
 INSERT INTO single_day_log SELECT * FROM icebench.`icebench.single_day_log`
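For context, the hunks in this file reference a MergeTree table named `single_day_log` and its `ORDER BY` key. A rough sketch of the pattern the guide describes follows; the column list is hypothetical, inferred only from the `ORDER BY` key and the query description visible in the diff, not taken from the guide's actual schema:

```sql
-- Hypothetical sketch: a MergeTree target table for the catalog data.
-- Columns other than those in the ORDER BY key are illustrative only.
CREATE TABLE single_day_log
(
    event_time    DateTime,
    instance_type LowCardinality(String),
    thread_name   LowCardinality(String),
    logger        LowCardinality(String),
    message       String
)
ENGINE = MergeTree
ORDER BY (instance_type, thread_name, toStartOfMinute(event_time));

-- Load from the catalog-backed table, as in the diff above:
INSERT INTO single_day_log SELECT * FROM icebench.`icebench.single_day_log`;
```

The sorting key uses an expression (`toStartOfMinute(event_time)`), which MergeTree permits; that key is taken verbatim from the `ORDER BY` shown in the hunk header above.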

docs/use-cases/data_lake/guides/writing-data.md

Lines changed: 1 addition & 1 deletion

@@ -11,7 +11,7 @@ keywords: ['data lake', 'lakehouse', 'write', 'iceberg', 'reverse ETL', 'INSERT
 doc_type: 'guide'
 ---
 
-In the previous guides, you queried open table formats in place and loaded data into MergeTree for fast analytics. In many architectures, data also needs to flow in the other direction - from ClickHouse back into lakehouse formats. Two common scenarios drive this:
+In the previous guides, you queried open table formats in place and loaded data into MergeTree for fast analytics. In many architectures, data also needs to flow in the other direction - from ClickHouse back into open table formats. Two common scenarios drive this:
 
 - **Offloading to long-term storage** - Data arrives in ClickHouse as a real-time analytics layer, powering dashboards and operational reporting. Once the data ages beyond its real-time window, it can be written out to Iceberg in object storage for durable, cost-effective retention in an interoperable format.
 - **Reverse ETL** - Transformations, aggregations, and enrichment performed inside ClickHouse produce derived datasets that downstream tools and other teams need to consume. Writing these results to Iceberg tables makes them available across the broader data ecosystem.

docs/use-cases/data_lake/index.md

Lines changed: 3 additions & 3 deletions

@@ -3,18 +3,18 @@ description: 'Use ClickHouse to query, accelerate, and analyze data in open tabl
 pagination_prev: null
 pagination_next: null
 slug: /use-cases/data-lake
-title: 'Data Lakehouse'
+title: 'Data Lake'
 keywords: ['data lake', 'lakehouse', 'iceberg', 'delta lake', 'hudi', 'paimon', 'glue', 'unity', 'rest', 'OneLake', 'BigLake']
 doc_type: 'landing-page'
 ---
 
-ClickHouse integrates with open lakehouse table formats, including [Apache Iceberg](/engines/table-engines/integrations/iceberg), [Delta Lake](/engines/table-engines/integrations/deltalake), [Apache Hudi](/engines/table-engines/integrations/hudi), and [Apache Paimon](/sql-reference/table-functions/paimon). This allows users to connect ClickHouse to data already stored in these formats across object storage, combining the analytical power of ClickHouse with their existing data lake infrastructure.
+ClickHouse integrates with open table formats, including [Apache Iceberg](/engines/table-engines/integrations/iceberg), [Delta Lake](/engines/table-engines/integrations/deltalake), [Apache Hudi](/engines/table-engines/integrations/hudi), and [Apache Paimon](/sql-reference/table-functions/paimon). This allows users to connect ClickHouse to data already stored in these formats across object storage, combining the analytical power of ClickHouse with their existing data lake infrastructure.
 
 ## Why use ClickHouse with open table formats? {#why-clickhouse-uses-lake-formats}
 
 ### Query existing data in place {#querying-data-in-place}
 
-ClickHouse can query open table formats directly in object storage without duplicating data. Organizations standardized on Iceberg, Delta Lake, Hudi, or Paimon can point ClickHouse at existing tables and immediately use its SQL dialect, analytical functions, and efficient native Parquet reader. At the same time, tools like [clickhouse-local](/operations/utilities/clickhouse-local) and [chDB](/chdb) enable exploratory, ad hoc analysis across more than 70 file formats in remote storage, allowing users to interactively explore lakehouse datasets with no infrastructure setup.
+ClickHouse can query open table formats directly in object storage without duplicating data. Organizations standardized on Iceberg, Delta Lake, Hudi, or Paimon can point ClickHouse at existing tables and immediately use its SQL dialect, analytical functions, and efficient native Parquet reader. At the same time, tools like [clickhouse-local](/operations/utilities/clickhouse-local) and [chDB](/chdb) enable exploratory, ad hoc analysis across more than 70 file formats in remote storage, allowing users to interactively explore data lake datasets with no infrastructure setup.
 
 Users can achieve this with either direct reading, using [table functions and table engines](/use-cases/data-lake/getting-started/querying-directly), or by [connecting to a data catalogue](/use-cases/data-lake/getting-started/connecting-catalogs).
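For readers unfamiliar with the direct-reading path mentioned in the hunk above, it is typically done with a table function such as `icebergS3`. A minimal sketch follows; the bucket URL and credentials are placeholders, not values from the docs being changed:

```sql
-- Query an Iceberg table in S3 directly, without a catalog connection.
-- The URL, access key, and secret key below are placeholders.
SELECT count()
FROM icebergS3(
    'https://my-bucket.s3.amazonaws.com/warehouse/db/my_table/',
    'ACCESS_KEY_ID',
    'SECRET_ACCESS_KEY'
);
```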
docs/use-cases/data_lake/support-matrix.md

Lines changed: 4 additions & 4 deletions

@@ -5,19 +5,19 @@ slug: /use-cases/data-lake/support-matrix
 sidebar_position: 3
 pagination_prev: null
 pagination_next: null
-description: 'Comprehensive support matrices for ClickHouse lakehouse format integrations and data catalog connections.'
+description: 'Comprehensive support matrices for ClickHouse open table format integrations and data catalog connections.'
 keywords: ['data lake', 'lakehouse', 'support', 'iceberg', 'delta lake', 'hudi', 'paimon', 'catalog', 'features']
 doc_type: 'reference'
 ---
 
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 
-This page provides comprehensive support matrices for ClickHouse's lakehouse integrations. It covers the features available for each lakehouse table format, the catalogs ClickHouse can connect to, and the capabilities supported by each catalog.
+This page provides comprehensive support matrices for ClickHouse's data lake integrations. It covers the features available for each open table format, the catalogs ClickHouse can connect to, and the capabilities supported by each catalog.
 
-## Lakehouse format support {#format-support}
+## Open table format support {#format-support}
 
-ClickHouse integrates with four lakehouse table formats: [Apache Iceberg](/engines/table-engines/integrations/iceberg), [Delta Lake](/engines/table-engines/integrations/deltalake), [Apache Hudi](/engines/table-engines/integrations/hudi), and [Apache Paimon](/sql-reference/table-functions/paimon). Select a format below to view its support matrix.
+ClickHouse integrates with four open table formats: [Apache Iceberg](/engines/table-engines/integrations/iceberg), [Delta Lake](/engines/table-engines/integrations/deltalake), [Apache Hudi](/engines/table-engines/integrations/hudi), and [Apache Paimon](/sql-reference/table-functions/paimon). Select a format below to view its support matrix.
 
 **Legend:** ✅ Supported | ⚠️ Partial / Experimental | ❌ Not supported
