You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: docs/concepts/olap.md
+3-6Lines changed: 3 additions & 6 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -10,14 +10,11 @@ slug: /concepts/olap
10
10
11
11
[OLAP](https://en.wikipedia.org/wiki/Online_analytical_processing) stands for Online Analytical Processing. It is a broad term that can be looked at from two perspectives: technical and business. At the highest level, you can just read these words backward:
12
12
13
-
Processing
14
-
: Some source data is processed…
13
+
**Processing** some source data is processed…
15
14
16
-
Analytical
17
-
: …to produce some analytical reports and insights…
15
+
**Analytical** …to produce some analytical reports and insights…
18
16
19
-
Online
20
-
: …in real-time.
17
+
**Online** …in real-time.
21
18
22
19
## OLAP from the Business Perspective {#olap-from-the-business-perspective}
Copy file name to clipboardExpand all lines: docs/deployment-modes.md
+4-9Lines changed: 4 additions & 9 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -23,7 +23,7 @@ This guide explores the four main ways to deploy and use ClickHouse:
23
23
24
24
Each deployment mode has its own strengths and ideal use cases, which we'll explore in detail below.
25
25
26
-
<iframewidth="560"height="315"src="https://www.youtube.com/embed/EOXEW_-r10A?si=6IanDSJlRzN8f9Mo"title="YouTube video player"frameborder="0"allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"referrerpolicy="strict-origin-when-cross-origin"allowfullscreen></iframe>
26
+
<iframewidth="1024"height="576"src="https://www.youtube.com/embed/EOXEW_-r10A?si=6IanDSJlRzN8f9Mo"title="YouTube video player"frameborder="0"allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"referrerpolicy="strict-origin-when-cross-origin"allowfullscreen></iframe>
27
27
28
28
## ClickHouse Server {#clickhouse-server}
29
29
@@ -41,9 +41,7 @@ This deployment mode is the go-to choice for production environments where relia
41
41
42
42
[ClickHouse Cloud](/cloud/overview) is a fully managed version of ClickHouse that removes the operational overhead of running your own deployment. While it maintains all the core capabilities of ClickHouse Server, it enhances the experience with additional features designed to streamline development and operations.
A key advantage of ClickHouse Cloud is its integrated tooling. [ClickPipes](/cloud/get-started/cloud-quick-start#clickpipes) provides a robust data ingestion framework, allowing you to easily connect and stream data from various sources without managing complex ETL pipelines. The platform also offers a dedicated [querying API](/cloud/get-started/query-endpoints), making it significantly easier to build applications.
49
47
@@ -57,8 +55,7 @@ The managed nature of the service means you don't need to worry about updates, b
57
55
58
56
[clickhouse-local](/operations/utilities/clickhouse-local) is a powerful command-line tool that provides the complete functionality of ClickHouse in a standalone executable. It's essentially the same database as ClickHouse Server, but packaged in a way that lets you harness all of ClickHouse's capabilities directly from the command line without running a server instance.
This tool excels at ad-hoc data analysis, particularly when working with local files or data stored in cloud storage services. You can directly query files in various formats (CSV, JSON, Parquet, etc.) using ClickHouse's SQL dialect, making it an excellent choice for quick data exploration or one-off analysis tasks.
64
61
@@ -70,9 +67,7 @@ The combination of remote table functions and access to the local file system ma
70
67
71
68
[chDB](/chdb) is ClickHouse embedded as an in-process database engine,, with Python being the primary implementation, though it's also available for Go, Rust, NodeJS, and Bun. This deployment option brings ClickHouse's powerful OLAP capabilities directly into your application's process, eliminating the need for a separate database installation.
chDB provides seamless integration with your application's ecosystem. In Python, for example, it's optimized to work efficiently with common data science tools like Pandas and Arrow, minimizing data copying overhead through Python memoryview. This makes it particularly valuable for data scientists and analysts who want to leverage ClickHouse's query performance within their existing workflows.
Copy file name to clipboardExpand all lines: docs/guides/inserting-data.md
+2-9Lines changed: 2 additions & 9 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -7,6 +7,7 @@ slug: /guides/inserting-data
7
7
---
8
8
9
9
import postgres_inserts from '@site/static/images/guides/postgres-inserts.png';
10
+
import Image from '@theme/IdealImage';
10
11
11
12
## Basic Example {#basic-example}
12
13
@@ -90,15 +91,7 @@ If large batches cannot be inserted, users can delegate batching to ClickHouse u
90
91
91
92
With asynchronous inserts, data is inserted into a buffer first and then written to the database storage later in 3 steps, as illustrated by the diagram below:
import bigquery_1 from '@site/static/images/migrations/bigquery-1.png';
9
+
import Image from '@theme/IdealImage';
9
10
10
11
# BigQuery vs ClickHouse Cloud: Equivalent and different concepts
11
12
12
13
## Resource organization {#resource-organization}
13
14
14
15
The way resources are organized in ClickHouse Cloud is similar to [BigQuery's resource hierarchy](https://cloud.google.com/bigquery/docs/resource-hierarchy). We describe specific differences below based on the following diagram showing the ClickHouse Cloud resource hierarchy:
Similar to BigQuery, organizations are the root nodes in the ClickHouse cloud resource hierarchy. The first user you set up in your ClickHouse Cloud account is automatically assigned to an organization owned by the user. The user may invite additional users to the organization.
21
+
Similar to BigQuery, organizations are the root nodes in the ClickHouse cloud resource hierarchy. The first user you set up in your ClickHouse Cloud account is automatically assigned to an organization owned by the user. The user may invite additional users to the organization.
26
22
27
23
### BigQuery Projects vs ClickHouse Cloud Services {#bigquery-projects-vs-clickhouse-cloud-services}
28
24
29
25
Within organizations, you can create services loosely equivalent to BigQuery projects because stored data in ClickHouse Cloud is associated with a service. There are [several service types available](/cloud/manage/cloud-tiers) in ClickHouse Cloud. Each ClickHouse Cloud service is deployed in a specific region and includes:
30
26
31
-
1. A group of compute nodes (currently, 2 nodes for a Development tier service and 3 for a Production tier service). For these nodes, ClickHouse Cloud [supports vertical and horizontal scaling](/manage/scaling#how-scaling-works-in-clickhouse-cloud), both manually and automatically.
27
+
1. A group of compute nodes (currently, 2 nodes for a Development tier service and 3 for a Production tier service). For these nodes, ClickHouse Cloud [supports vertical and horizontal scaling](/manage/scaling#how-scaling-works-in-clickhouse-cloud), both manually and automatically.
32
28
2. An object storage folder where the service stores all the data.
33
29
3. An endpoint (or multiple endpoints created via ClickHouse Cloud UI console) - a service URL that you use to connect to the service (for example, `https://dv2fzne24g.us-east-1.aws.clickhouse.cloud:8443`)
34
30
35
31
### BigQuery Datasets vs ClickHouse Cloud Databases {#bigquery-datasets-vs-clickhouse-cloud-databases}
36
32
37
-
ClickHouse logically groups tables into databases. Like BigQuery datasets, ClickHouse databases are logical containers that organize and control access to table data.
33
+
ClickHouse logically groups tables into databases. Like BigQuery datasets, ClickHouse databases are logical containers that organize and control access to table data.
38
34
39
35
### BigQuery Folders {#bigquery-folders}
40
36
@@ -77,13 +73,13 @@ ClickHouse offers more granular precision with respect to numerics. For example,
When presented with multiple options for ClickHouse types, consider the actual range of the data and pick the lowest required. Also, consider utilizing [appropriate codecs](https://clickhouse.com/blog/optimize-clickhouse-codecs-compression-schema) for further compression.
76
+
When presented with multiple options for ClickHouse types, consider the actual range of the data and pick the lowest required. Also, consider utilizing [appropriate codecs](https://clickhouse.com/blog/optimize-clickhouse-codecs-compression-schema) for further compression.
### Primary and Foreign keys and Primary index {#primary-and-foreign-keys-and-primary-index}
85
81
86
-
In BigQuery, a table can have [primary key and foreign key constraints](https://cloud.google.com/bigquery/docs/information-schema-table-constraints). Typically, primary and foreign keys are used in relational databases to ensure data integrity. A primary key value is normally unique for each row and is not `NULL`. Each foreign key value in a row must be present in the primary key column of the primary key table or be `NULL`. In BigQuery, these constraints are not enforced, but the query optimizer may use this information to optimize queries better.
82
+
In BigQuery, a table can have [primary key and foreign key constraints](https://cloud.google.com/bigquery/docs/information-schema-table-constraints). Typically, primary and foreign keys are used in relational databases to ensure data integrity. A primary key value is normally unique for each row and is not `NULL`. Each foreign key value in a row must be present in the primary key column of the primary key table or be `NULL`. In BigQuery, these constraints are not enforced, but the query optimizer may use this information to optimize queries better.
87
83
88
84
In ClickHouse, a table can also have a primary key. Like BigQuery, ClickHouse doesn't enforce uniqueness for a table's primary key column values. Unlike BigQuery, a table's data is stored on disk [ordered](/guides/best-practices/sparse-primary-indexes#optimal-compression-ratio-of-data-files) by the primary key column(s). The query optimizer utilizes this sort order to prevent resorting, to minimize memory usage for joins, and to enable short-circuiting for limit clauses. Unlike BigQuery, ClickHouse automatically creates [a (sparse) primary index](/guides/best-practices/sparse-primary-indexes#an-index-design-for-massive-data-scales) based on the primary key column values. This index is used to speed up all queries that contain filters on the primary key columns. ClickHouse currently doesn't support foreign key constraints.
89
85
@@ -102,7 +98,7 @@ In addition to the primary index created from the values of a table's primary ke
102
98
103
99
## Search indexes {#search-indexes}
104
100
105
-
Similar to [search indexes](https://cloud.google.com/bigquery/docs/search-index) in BigQuery, [full-text indexes](/engines/table-engines/mergetree-family/invertedindexes) can be created for ClickHouse tables on columns with string values.
101
+
Similar to [search indexes](https://cloud.google.com/bigquery/docs/search-index) in BigQuery, [full-text indexes](/engines/table-engines/mergetree-family/invertedindexes) can be created for ClickHouse tables on columns with string values.
106
102
107
103
## Vector indexes {#vector-indexes}
108
104
@@ -120,7 +116,7 @@ In ClickHouse, data is automatically [clustered on disk](/guides/best-practices/
120
116
121
117
## Materialized views {#materialized-views}
122
118
123
-
Both BigQuery and ClickHouse support materialized views – precomputed results based on a transformation query's result against a base table for increased performance and efficiency.
119
+
Both BigQuery and ClickHouse support materialized views – precomputed results based on a transformation query's result against a base table for increased performance and efficiency.
@@ -160,9 +156,9 @@ ClickHouse provides standard SQL with many extensions and improvements that make
160
156
161
157
Compared to BigQuery's 8 array functions, ClickHouse has over 80 [built-in array functions](/sql-reference/functions/array-functions) for modeling and solving a wide range of problems elegantly and simply.
162
158
163
-
A typical design pattern in ClickHouse is to use the [`groupArray`](/sql-reference/aggregate-functions/reference/grouparray) aggregate function to (temporarily) transform specific row values of a table into an array. This then can be conveniently processed via array functions, and the result can be converted back into individual table rows via [`arrayJoin`](/sql-reference/functions/array-join) aggregate function.
159
+
A typical design pattern in ClickHouse is to use the [`groupArray`](/sql-reference/aggregate-functions/reference/grouparray) aggregate function to (temporarily) transform specific row values of a table into an array. This then can be conveniently processed via array functions, and the result can be converted back into individual table rows via [`arrayJoin`](/sql-reference/functions/array-join) aggregate function.
164
160
165
-
Because ClickHouse SQL supports [higher order lambda functions](/sql-reference/functions/overview#arrow-operator-and-lambda), many advanced array operations can be achieved by simply calling one of the higher order built-in array functions, instead of temporarily converting arrays back to tables, as it is often [required](https://cloud.google.com/bigquery/docs/arrays) in BigQuery, e.g. for [filtering](https://cloud.google.com/bigquery/docs/arrays#filtering_arrays) or [zipping](https://cloud.google.com/bigquery/docs/arrays#zipping_arrays) arrays. In ClickHouse these operations are just a simple function call of the higher order functions [`arrayFilter`](/sql-reference/functions/array-functions#arrayfilterfunc-arr1-), and [`arrayZip`](/sql-reference/functions/array-functions#arrayzip), respectively.
161
+
Because ClickHouse SQL supports [higher order lambda functions](/sql-reference/functions/overview#arrow-operator-and-lambda), many advanced array operations can be achieved by simply calling one of the higher order built-in array functions, instead of temporarily converting arrays back to tables, as it is often [required](https://cloud.google.com/bigquery/docs/arrays) in BigQuery, e.g. for [filtering](https://cloud.google.com/bigquery/docs/arrays#filtering_arrays) or [zipping](https://cloud.google.com/bigquery/docs/arrays#zipping_arrays) arrays. In ClickHouse these operations are just a simple function call of the higher order functions [`arrayFilter`](/sql-reference/functions/array-functions#arrayfilterfunc-arr1-), and [`arrayZip`](/sql-reference/functions/array-functions#arrayzip), respectively.
166
162
167
163
In the following, we provide a mapping of array operations from BigQuery to ClickHouse:
Requires temporarily converting arrays back to tables via [`UNNEST`](https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#unnest_operator) operator
325
+
Requires temporarily converting arrays back to tables via [`UNNEST`](https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#unnest_operator) operator
0 commit comments