Commit 1677952

more images
1 parent b07428f commit 1677952

3 files changed: 33 additions, 105 deletions

docs/migrations/bigquery/equivalent-concepts.md

Lines changed: 12 additions & 16 deletions
@@ -6,35 +6,31 @@ keywords: ['migrate', 'migration', 'migrating', 'data', 'etl', 'elt', 'BigQuery'
 ---

 import bigquery_1 from '@site/static/images/migrations/bigquery-1.png';
+import Image from '@theme/IdealImage';

 # BigQuery vs ClickHouse Cloud: Equivalent and different concepts

 ## Resource organization {#resource-organization}

 The way resources are organized in ClickHouse Cloud is similar to [BigQuery's resource hierarchy](https://cloud.google.com/bigquery/docs/resource-hierarchy). We describe specific differences below based on the following diagram showing the ClickHouse Cloud resource hierarchy:

-<img src={bigquery_1}
-class="image"
-alt="NEEDS ALT"
-style={{width: '600px'}} />
-
-<br />
+<Image img={bigquery_1} size="md" alt="Resource organizations"/>

 ### Organizations {#organizations}

-Similar to BigQuery, organizations are the root nodes in the ClickHouse cloud resource hierarchy. The first user you set up in your ClickHouse Cloud account is automatically assigned to an organization owned by the user. The user may invite additional users to the organization.
+Similar to BigQuery, organizations are the root nodes in the ClickHouse cloud resource hierarchy. The first user you set up in your ClickHouse Cloud account is automatically assigned to an organization owned by the user. The user may invite additional users to the organization.

 ### BigQuery Projects vs ClickHouse Cloud Services {#bigquery-projects-vs-clickhouse-cloud-services}

 Within organizations, you can create services loosely equivalent to BigQuery projects because stored data in ClickHouse Cloud is associated with a service. There are [several service types available](/cloud/manage/cloud-tiers) in ClickHouse Cloud. Each ClickHouse Cloud service is deployed in a specific region and includes:

-1. A group of compute nodes (currently, 2 nodes for a Development tier service and 3 for a Production tier service). For these nodes, ClickHouse Cloud [supports vertical and horizontal scaling](/manage/scaling#how-scaling-works-in-clickhouse-cloud), both manually and automatically.
+1. A group of compute nodes (currently, 2 nodes for a Development tier service and 3 for a Production tier service). For these nodes, ClickHouse Cloud [supports vertical and horizontal scaling](/manage/scaling#how-scaling-works-in-clickhouse-cloud), both manually and automatically.
 2. An object storage folder where the service stores all the data.
 3. An endpoint (or multiple endpoints created via ClickHouse Cloud UI console) - a service URL that you use to connect to the service (for example, `https://dv2fzne24g.us-east-1.aws.clickhouse.cloud:8443`)

 ### BigQuery Datasets vs ClickHouse Cloud Databases {#bigquery-datasets-vs-clickhouse-cloud-databases}

-ClickHouse logically groups tables into databases. Like BigQuery datasets, ClickHouse databases are logical containers that organize and control access to table data.
+ClickHouse logically groups tables into databases. Like BigQuery datasets, ClickHouse databases are logical containers that organize and control access to table data.

 ### BigQuery Folders {#bigquery-folders}

@@ -77,13 +73,13 @@ ClickHouse offers more granular precision with respect to numerics. For example,
 | [TIME](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#time_type) | [DateTime64](/sql-reference/data-types/datetime64) |
 | [TIMESTAMP](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#timestamp_type) | [DateTime64](/sql-reference/data-types/datetime64) |

-When presented with multiple options for ClickHouse types, consider the actual range of the data and pick the lowest required. Also, consider utilizing [appropriate codecs](https://clickhouse.com/blog/optimize-clickhouse-codecs-compression-schema) for further compression.
+When presented with multiple options for ClickHouse types, consider the actual range of the data and pick the lowest required. Also, consider utilizing [appropriate codecs](https://clickhouse.com/blog/optimize-clickhouse-codecs-compression-schema) for further compression.

 ## Query acceleration techniques {#query-acceleration-techniques}

 ### Primary and Foreign keys and Primary index {#primary-and-foreign-keys-and-primary-index}

-In BigQuery, a table can have [primary key and foreign key constraints](https://cloud.google.com/bigquery/docs/information-schema-table-constraints). Typically, primary and foreign keys are used in relational databases to ensure data integrity. A primary key value is normally unique for each row and is not `NULL`. Each foreign key value in a row must be present in the primary key column of the primary key table or be `NULL`. In BigQuery, these constraints are not enforced, but the query optimizer may use this information to optimize queries better.
+In BigQuery, a table can have [primary key and foreign key constraints](https://cloud.google.com/bigquery/docs/information-schema-table-constraints). Typically, primary and foreign keys are used in relational databases to ensure data integrity. A primary key value is normally unique for each row and is not `NULL`. Each foreign key value in a row must be present in the primary key column of the primary key table or be `NULL`. In BigQuery, these constraints are not enforced, but the query optimizer may use this information to optimize queries better.

 In ClickHouse, a table can also have a primary key. Like BigQuery, ClickHouse doesn't enforce uniqueness for a table's primary key column values. Unlike BigQuery, a table's data is stored on disk [ordered](/guides/best-practices/sparse-primary-indexes#optimal-compression-ratio-of-data-files) by the primary key column(s). The query optimizer utilizes this sort order to prevent resorting, to minimize memory usage for joins, and to enable short-circuiting for limit clauses. Unlike BigQuery, ClickHouse automatically creates [a (sparse) primary index](/guides/best-practices/sparse-primary-indexes#an-index-design-for-massive-data-scales) based on the primary key column values. This index is used to speed up all queries that contain filters on the primary key columns. ClickHouse currently doesn't support foreign key constraints.

@@ -102,7 +98,7 @@ In addition to the primary index created from the values of a table's primary ke

 ## Search indexes {#search-indexes}

-Similar to [search indexes](https://cloud.google.com/bigquery/docs/search-index) in BigQuery, [full-text indexes](/engines/table-engines/mergetree-family/invertedindexes) can be created for ClickHouse tables on columns with string values.
+Similar to [search indexes](https://cloud.google.com/bigquery/docs/search-index) in BigQuery, [full-text indexes](/engines/table-engines/mergetree-family/invertedindexes) can be created for ClickHouse tables on columns with string values.

 ## Vector indexes {#vector-indexes}

@@ -120,7 +116,7 @@ In ClickHouse, data is automatically [clustered on disk](/guides/best-practices/

 ## Materialized views {#materialized-views}

-Both BigQuery and ClickHouse support materialized views – precomputed results based on a transformation query's result against a base table for increased performance and efficiency.
+Both BigQuery and ClickHouse support materialized views – precomputed results based on a transformation query's result against a base table for increased performance and efficiency.

 ## Querying materialized views {#querying-materialized-views}

@@ -160,9 +156,9 @@ ClickHouse provides standard SQL with many extensions and improvements that make

 Compared to BigQuery's 8 array functions, ClickHouse has over 80 [built-in array functions](/sql-reference/functions/array-functions) for modeling and solving a wide range of problems elegantly and simply.

-A typical design pattern in ClickHouse is to use the [`groupArray`](/sql-reference/aggregate-functions/reference/grouparray) aggregate function to (temporarily) transform specific row values of a table into an array. This then can be conveniently processed via array functions, and the result can be converted back into individual table rows via [`arrayJoin`](/sql-reference/functions/array-join) aggregate function.
+A typical design pattern in ClickHouse is to use the [`groupArray`](/sql-reference/aggregate-functions/reference/grouparray) aggregate function to (temporarily) transform specific row values of a table into an array. This then can be conveniently processed via array functions, and the result can be converted back into individual table rows via [`arrayJoin`](/sql-reference/functions/array-join) aggregate function.

-Because ClickHouse SQL supports [higher order lambda functions](/sql-reference/functions/overview#arrow-operator-and-lambda), many advanced array operations can be achieved by simply calling one of the higher order built-in array functions, instead of temporarily converting arrays back to tables, as it is often [required](https://cloud.google.com/bigquery/docs/arrays) in BigQuery, e.g. for [filtering](https://cloud.google.com/bigquery/docs/arrays#filtering_arrays) or [zipping](https://cloud.google.com/bigquery/docs/arrays#zipping_arrays) arrays. In ClickHouse these operations are just a simple function call of the higher order functions [`arrayFilter`](/sql-reference/functions/array-functions#arrayfilterfunc-arr1-), and [`arrayZip`](/sql-reference/functions/array-functions#arrayzip), respectively.
+Because ClickHouse SQL supports [higher order lambda functions](/sql-reference/functions/overview#arrow-operator-and-lambda), many advanced array operations can be achieved by simply calling one of the higher order built-in array functions, instead of temporarily converting arrays back to tables, as it is often [required](https://cloud.google.com/bigquery/docs/arrays) in BigQuery, e.g. for [filtering](https://cloud.google.com/bigquery/docs/arrays#filtering_arrays) or [zipping](https://cloud.google.com/bigquery/docs/arrays#zipping_arrays) arrays. In ClickHouse these operations are just a simple function call of the higher order functions [`arrayFilter`](/sql-reference/functions/array-functions#arrayfilterfunc-arr1-), and [`arrayZip`](/sql-reference/functions/array-functions#arrayzip), respectively.

 In the following, we provide a mapping of array operations from BigQuery to ClickHouse:

@@ -326,7 +322,7 @@ Query id: b324c11f-655b-479f-9337-f4d34fd02190

 _BigQuery_

-Requires temporarily converting arrays back to tables via [`UNNEST`](https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#unnest_operator) operator
+Requires temporarily converting arrays back to tables via [`UNNEST`](https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#unnest_operator) operator

 ```sql
 WITH Sequences AS
