docs/integrations/data-ingestion/dbms/dynamodb/index.md (+5 -5)
@@ -12,6 +12,7 @@ import ExperimentalBadge from '@theme/badges/ExperimentalBadge';
import dynamodb_kinesis_stream from '@site/static/images/integrations/data-ingestion/dbms/dynamodb/dynamodb-kinesis-stream.png';
import dynamodb_s3_export from '@site/static/images/integrations/data-ingestion/dbms/dynamodb/dynamodb-s3-export.png';
import dynamodb_map_columns from '@site/static/images/integrations/data-ingestion/dbms/dynamodb/dynamodb-map-columns.png';
+import Image from '@theme/IdealImage';

# CDC from DynamoDB to ClickHouse
@@ -31,14 +32,14 @@ Data will be ingested into a `ReplacingMergeTree`. This table engine is commonly

First, you will want to enable a Kinesis stream on your DynamoDB table to capture changes in real-time. We want to do this before we create the snapshot to avoid missing any data.
Find the AWS guide located [here](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/kds.html).

## 2. Create the snapshot {#2-create-the-snapshot}

Next, we will create a snapshot of the DynamoDB table. This can be achieved through an AWS export to S3. Find the AWS guide located [here](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/S3DataExport.HowItWorks.html).

**You will want to do a "Full export" in the DynamoDB JSON format.**
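Once the full export has landed in S3, the snapshot can be bulk-loaded with ClickHouse's `s3` table function. The sketch below is illustrative only: the bucket path, staging table, attribute names, and types are assumptions, and credentials are omitted. Each exported line is read as a raw JSON string and the DynamoDB JSON attribute wrappers are unpacked with `JSONExtract*` functions.

```sql
-- Peek at a few exported rows (each line looks like {"Item": {"id": {"S": "..."}, ...}}).
SELECT json
FROM s3('https://my-bucket.s3.amazonaws.com/AWSDynamoDB/*/data/*.json.gz', 'JSONAsString', 'json String')
LIMIT 5;

-- Hypothetical staging table for the snapshot.
CREATE TABLE dynamodb_snapshot
(
    id String,
    payload String,
    version UInt64
)
ENGINE = ReplacingMergeTree(version)
ORDER BY id;

-- Unpack the DynamoDB JSON attribute wrappers while loading.
INSERT INTO dynamodb_snapshot
SELECT
    JSONExtractString(json, 'Item', 'id', 'S')                       AS id,
    JSONExtractString(json, 'Item', 'payload', 'S')                  AS payload,
    toUInt64OrZero(JSONExtractString(json, 'Item', 'version', 'N'))  AS version
FROM s3('https://my-bucket.s3.amazonaws.com/AWSDynamoDB/*/data/*.json.gz', 'JSONAsString', 'json String');
```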
docs/integrations/data-ingestion/dbms/postgresql/postgres-vs-clickhouse.md (+9 -18)
@@ -6,6 +6,7 @@ description: 'Page which explores the similarities and differences between Postg
---

import postgresReplicas from '@site/static/images/integrations/data-ingestion/dbms/postgres-replicas.png';
+import Image from '@theme/IdealImage';

## Postgres vs ClickHouse: Equivalent and different concepts {#postgres-vs-clickhouse-equivalent-and-different-concepts}
@@ -33,22 +34,15 @@ ClickHouse uses ClickHouse Keeper (C++ ZooKeeper implementation, ZooKeeper can a

The replication process in ClickHouse (1) starts when data is inserted into any replica. This data, in its raw insert form, is (2) written to disk along with its checksums. Once written, the replica (3) attempts to register this new data part in Keeper by allocating a unique block number and logging the new part's details. Other replicas, upon (4) detecting new entries in the replication log, (5) download the corresponding data part via an internal HTTP protocol, verifying it against the checksums listed in ZooKeeper. This method ensures that all replicas eventually hold consistent and up-to-date data despite varying processing speeds or potential delays. Moreover, the system is capable of handling multiple operations concurrently, optimizing data management processes, and allowing for system scalability and robustness against hardware discrepancies.

-Note that ClickHouse Cloud uses a [cloud-optimized replication mechanism](https://clickhouse.com/blog/clickhouse-cloud-boosts-performance-with-sharedmergetree-and-lightweight-updates) adapted to its separation of storage and compute architecture. By storing data in shared object storage, data is automatically available for all compute nodes without the need to physically replicate data between nodes. Instead, Keeper is used to only share metadata (which data exists where in object storage) between compute nodes.
+Note that ClickHouse Cloud uses a [cloud-optimized replication mechanism](https://clickhouse.com/blog/clickhouse-cloud-boosts-performance-with-sharedmergetree-and-lightweight-updates) adapted to its separation of storage and compute architecture. By storing data in shared object storage, data is automatically available for all compute nodes without the need to physically replicate data between nodes. Instead, Keeper is used to only share metadata (which data exists where in object storage) between compute nodes.

PostgreSQL employs a different replication strategy compared to ClickHouse, primarily using streaming replication, which involves a primary-replica model where data is continuously streamed from the primary to one or more replica nodes. This type of replication ensures near real-time consistency and is synchronous or asynchronous, giving administrators control over the balance between availability and consistency. Unlike ClickHouse, PostgreSQL relies on a WAL (Write-Ahead Log) with logical replication and decoding to stream data objects and changes between nodes. This approach in PostgreSQL is more straightforward but might not offer the same level of scalability and fault tolerance in highly distributed environments that ClickHouse achieves through its complex use of Keeper for distributed operations coordination and eventual consistency.

## User implications {#user-implications}

-In ClickHouse, the possibility of dirty reads - where users can write data to one replica and then read potentially unreplicated data from another—arises from its eventually consistent replication model managed via Keeper. This model emphasizes performance and scalability across distributed systems, allowing replicas to operate independently and sync asynchronously. As a result, newly inserted data might not be immediately visible across all replicas, depending on the replication lag and the time it takes for changes to propagate through the system.
+In ClickHouse, the possibility of dirty reads - where users can write data to one replica and then read potentially unreplicated data from another - arises from its eventually consistent replication model managed via Keeper. This model emphasizes performance and scalability across distributed systems, allowing replicas to operate independently and sync asynchronously. As a result, newly inserted data might not be immediately visible across all replicas, depending on the replication lag and the time it takes for changes to propagate through the system.

Conversely, PostgreSQL's streaming replication model typically can prevent dirty reads by employing synchronous replication options where the primary waits for at least one replica to confirm the receipt of data before committing transactions. This ensures that once a transaction is committed, a guarantee exists that the data is available in another replica. In the event of primary failure, the replica will ensure queries see the committed data, thereby maintaining a stricter level of consistency.
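To make the eventual consistency described above concrete, replica lag and pending replication work can be inspected through ClickHouse system tables. A minimal sketch, assuming a hypothetical replicated table named `events`:

```sql
-- How far behind is this replica, and how much replication work is queued?
SELECT database, table, is_leader, absolute_delay, queue_size
FROM system.replicas
WHERE table = 'events';

-- Entries still waiting to be fetched or merged on this replica.
SELECT type, create_time, num_tries, last_exception
FROM system.replication_queue
WHERE table = 'events'
LIMIT 10;
```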
@@ -88,27 +82,27 @@ In this case, users should ensure consistent node routing is performed based on

-In exceptional cases, users may need sequential consistency.
+In exceptional cases, users may need sequential consistency.

-Sequential consistency in databases is where the operations on a database appear to be executed in some sequential order, and this order is consistent across all processes interacting with the database. This means that every operation appears to take effect instantaneously between its invocation and completion, and there is a single, agreed-upon order in which all operations are observed by any process.
+Sequential consistency in databases is where the operations on a database appear to be executed in some sequential order, and this order is consistent across all processes interacting with the database. This means that every operation appears to take effect instantaneously between its invocation and completion, and there is a single, agreed-upon order in which all operations are observed by any process.

From a user's perspective, this typically manifests itself as the need to write data into ClickHouse and, when reading data, to guarantee that the latest inserted rows are returned.

This can be achieved in several ways (in order of preference):

1. **Read/Write to the same node** - If you are using the native protocol, or a [session to do your write/read via HTTP](/interfaces/http#default-database), you should be connected to the same replica: in this scenario you're reading directly from the node where you're writing, so your read will always be consistent.
-1. **Sync replicas manually** - If you write to one replica and read from another, you can use issue `SYSTEM SYNC REPLICA LIGHTWEIGHT` prior to reading.
+1. **Sync replicas manually** - If you write to one replica and read from another, you can issue `SYSTEM SYNC REPLICA LIGHTWEIGHT` prior to reading.
1. **Enable sequential consistency** - via the query setting [`select_sequential_consistency = 1`](/operations/settings/settings#select_sequential_consistency). In OSS, the setting `insert_quorum = 'auto'` must also be specified (see the sketch after this list).
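A minimal sketch of options 2 and 3, assuming a hypothetical replicated table `db.events` with columns `(id UInt64, ts DateTime)`:

```sql
-- Option 2: ask the replica you read from to catch up before reading.
SYSTEM SYNC REPLICA db.events LIGHTWEIGHT;

-- Option 3: sequentially consistent write and read (insert_quorum is needed in OSS).
INSERT INTO db.events SETTINGS insert_quorum = 'auto' VALUES (1, now());

SELECT count()
FROM db.events
SETTINGS select_sequential_consistency = 1;
```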
<br />

See [here](/cloud/reference/shared-merge-tree#consistency) for further details on enabling these settings.

-> Use of sequential consistency will place a greater load on ClickHouse Keeper. The result can
+> Use of sequential consistency will place a greater load on ClickHouse Keeper. The result can
mean slower inserts and reads. With SharedMergeTree, used in ClickHouse Cloud as the main table engine, sequential consistency [incurs less overhead and will scale better](/cloud/reference/shared-merge-tree#consistency). OSS users should use this approach cautiously and measure Keeper load.
## Transactional (ACID) support {#transactional-acid-support}

-Users migrating from PostgreSQL may be used to its robust support for ACID (Atomicity, Consistency, Isolation, Durability) properties, making it a reliable choice for transactional databases. Atomicity in PostgreSQL ensures that each transaction is treated as a single unit, which either completely succeeds or is entirely rolled back, preventing partial updates. Consistency is maintained by enforcing constraints, triggers, and rules that guarantee that all database transactions lead to a valid state. Isolation levels, from Read Committed to Serializable, are supported in PostgreSQL, allowing fine-tuned control over the visibility of changes made by concurrent transactions. Lastly, Durability is achieved through write-ahead logging (WAL), ensuring that once a transaction is committed, it remains so even in the event of a system failure.
+Users migrating from PostgreSQL may be used to its robust support for ACID (Atomicity, Consistency, Isolation, Durability) properties, making it a reliable choice for transactional databases. Atomicity in PostgreSQL ensures that each transaction is treated as a single unit, which either completely succeeds or is entirely rolled back, preventing partial updates. Consistency is maintained by enforcing constraints, triggers, and rules that guarantee that all database transactions lead to a valid state. Isolation levels, from Read Committed to Serializable, are supported in PostgreSQL, allowing fine-tuned control over the visibility of changes made by concurrent transactions. Lastly, Durability is achieved through write-ahead logging (WAL), ensuring that once a transaction is committed, it remains so even in the event of a system failure.

These properties are common for OLTP databases that act as a source of truth.
@@ -125,6 +119,3 @@ PeerDB is now available natively in ClickHouse Cloud - Blazing-fast Postgres to

[PeerDB](https://www.peerdb.io/) enables you to seamlessly replicate data from Postgres to ClickHouse. You can use this tool for

1. continuous replication using CDC, allowing Postgres and ClickHouse to coexist—Postgres for OLTP and ClickHouse for OLAP; and
docs/integrations/data-ingestion/redshift/index.md (+7 -9)
@@ -11,6 +11,7 @@ import pull from '@site/static/images/integrations/data-ingestion/redshift/pull.
import pivot from '@site/static/images/integrations/data-ingestion/redshift/pivot.png';
import s3_1 from '@site/static/images/integrations/data-ingestion/redshift/s3-1.png';
import s3_2 from '@site/static/images/integrations/data-ingestion/redshift/s3-2.png';
+import Image from '@theme/IdealImage';

# Migrating Data from Redshift to ClickHouse
@@ -34,7 +35,7 @@ import s3_2 from '@site/static/images/integrations/data-ingestion/redshift/s3-2.

[Amazon Redshift](https://aws.amazon.com/redshift/) is a popular cloud data warehousing solution that is part of the Amazon Web Services offerings. This guide presents different approaches to migrating data from a Redshift instance to ClickHouse. We will cover three options:

-<img src={redshiftToClickhouse} class="image" alt="Redshift to ClickHouse Migration Options"/>
+<Image img={redshiftToClickhouse} size="lg" alt="Redshift to ClickHouse Migration Options" background="white"/>

From the ClickHouse instance standpoint, you can either:
@@ -53,8 +54,7 @@ We used Redshift as a data source in this tutorial. However, the migration appro

In the push scenario, the idea is to leverage a third-party tool or service (either custom code or an [ETL/ELT](https://en.wikipedia.org/wiki/Extract,_transform,_load#ETL_vs._ELT) tool) to send your data to your ClickHouse instance. For example, you can use software like [Airbyte](https://www.airbyte.com/) to move data between your Redshift instance (as a source) and ClickHouse as a destination ([see our integration guide for Airbyte](/integrations/data-ingestion/etl-tools/airbyte-and-clickhouse.md)).

-
-<img src={push} class="image" alt="PUSH Redshift to ClickHouse"/>
+<Image img={push} size="lg" alt="PUSH Redshift to ClickHouse" background="white"/>

### Pros {#pros}
@@ -72,8 +72,7 @@ In the push scenario, the idea is to leverage a third-party tool or service (eit

In the pull scenario, the idea is to leverage the ClickHouse JDBC Bridge to connect to a Redshift cluster directly from a ClickHouse instance and perform `INSERT INTO ... SELECT` queries:

-
-<img src={pull} class="image" alt="PULL from Redshift to ClickHouse"/>
+<Image img={pull} size="lg" alt="PULL from Redshift to ClickHouse" background="white"/>
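As a rough illustration of the pull approach, assuming the ClickHouse JDBC Bridge is running with a named Redshift datasource (here called `redshift`), a source table `public.users`, and a pre-created ClickHouse table `users` (all names are placeholders):

```sql
-- Inspect a few rows coming through the JDBC Bridge.
SELECT *
FROM jdbc('redshift', 'public', 'users')
LIMIT 5;

-- Pull the data across in a single INSERT INTO ... SELECT.
INSERT INTO users
SELECT *
FROM jdbc('redshift', 'public', 'users');
```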
### Pros {#pros-1}
@@ -197,7 +196,7 @@ If you are using ClickHouse Cloud, you will need to run your ClickHouse JDBC Bri

In this scenario, we export data to S3 in an intermediary pivot format and, in a second step, load the data from S3 into ClickHouse.

-<img src={pivot} class="image" alt="PIVOT from Redshift using S3"/>
+<Image img={pivot} size="lg" alt="PIVOT from Redshift using S3" background="white"/>

### Pros {#pros-2}
@@ -214,11 +213,11 @@ In this scenario, we export data to S3 in an intermediary pivot format and, in a

1. Using Redshift's [UNLOAD](https://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD.html) feature, export the data into an existing private S3 bucket:

-<img src={s3_1} class="image" alt="UNLOAD from Redshift to S3"/>
+<Image img={s3_1} size="md" alt="UNLOAD from Redshift to S3" background="white"/>

It will generate part files containing the raw data in S3

-<img src={s3_2} class="image" alt="Data in S3"/>
+<Image img={s3_2} size="md" alt="Data in S3" background="white"/>

2. Create the table in ClickHouse:
@@ -261,4 +260,3 @@ In this scenario, we export data to S3 in an intermediary pivot format and, in a

:::note
This example used CSV as the pivot format. However, for production workloads we recommend Apache Parquet as the best option for large migrations since it comes with compression and can save some storage costs while reducing transfer times. (By default, each row group is compressed using SNAPPY.) ClickHouse also leverages Parquet's column orientation to speed up data ingestion.
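As a sketch of the Parquet variant of this pivot approach (bucket, prefix, IAM role ARN, and table names are placeholders; credentials for the private bucket are omitted on the ClickHouse side, and the target table is assumed to exist):

```sql
-- On Redshift: export the table as Parquet part files.
UNLOAD ('SELECT * FROM users')
TO 's3://my-bucket/unload/users_'
IAM_ROLE 'arn:aws:iam::111122223333:role/MyRedshiftUnloadRole'
FORMAT AS PARQUET;

-- On ClickHouse: ingest the Parquet files directly from S3.
INSERT INTO users
SELECT *
FROM s3('https://my-bucket.s3.amazonaws.com/unload/users_*', 'Parquet');
```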
import postgres_partitions from '@site/static/images/migrations/postgres-partitions.png';
import postgres_projections from '@site/static/images/migrations/postgres-projections.png';
+import Image from '@theme/IdealImage';

> This is **Part 3** of a guide on migrating from PostgreSQL to ClickHouse. This content can be considered introductory, with the aim of helping users deploy an initial functional system that adheres to ClickHouse best practices. It avoids complex topics and will not result in a fully optimized schema; rather, it provides a solid foundation for users to build a production system and base their learning.
@@ -18,11 +19,7 @@ Postgres users will be familiar with the concept of table partitioning for enhan

In ClickHouse, partitioning is specified on a table when it is initially defined via the `PARTITION BY` clause. This clause can contain a SQL expression on any columns, the results of which will define which partition a row is sent to.

+<Image img={postgres_partitions} size="md" alt="PostgreSQL partitions to ClickHouse partitions"/>

The data parts are logically associated with each partition on disk and can be queried in isolation. For the example below, we partition the `posts` table by year using the expression `toYear(CreationDate)`. As rows are inserted into ClickHouse, this expression will be evaluated against each row, and the row will be routed to the resulting partition if it exists (if the row is the first for a year, the partition will be created).
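A minimal sketch of such a `PARTITION BY` definition, using a simplified, hypothetical `posts` schema (the page's full example is not shown in this hunk):

```sql
CREATE TABLE posts
(
    Id UInt32,
    UserId UInt32,
    Title String,
    CreationDate DateTime
)
ENGINE = MergeTree
ORDER BY (UserId, CreationDate)
PARTITION BY toYear(CreationDate);

-- Partitions can then be listed and managed in isolation.
SELECT partition, count() AS part_count, sum(rows) AS total_rows
FROM system.parts
WHERE table = 'posts' AND active
GROUP BY partition;
```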
@@ -210,11 +207,7 @@ WHERE UserId = 8592047

Projections are an appealing feature for new users as they are automatically maintained as data is inserted. Furthermore, queries can just be sent to a single table where the projections are exploited where possible to speed up the response time.

-<br />
-
-<img src={postgres_projections} class="image" alt="PostgreSQL projections in ClickHouse" style={{width: '600px'}} />
-
-<br />
+<Image img={postgres_projections} size="md" alt="PostgreSQL projections in ClickHouse"/>
This is in contrast to materialized views, where the user has to select the appropriate optimized target table or rewrite their query, depending on the filters. This places greater emphasis on user applications and increases client-side complexity.
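A minimal sketch of adding a projection of the kind discussed above, assuming the `posts` table carries a `UserId` column (names are illustrative):

```sql
-- An alternative ordering of the same data, maintained automatically on insert.
ALTER TABLE posts
    ADD PROJECTION posts_by_user
    (
        SELECT * ORDER BY UserId
    );

-- Build the projection for rows that already exist in the table.
ALTER TABLE posts MATERIALIZE PROJECTION posts_by_user;

-- Queries filtering on UserId can now be served from the projection.
SELECT Title, CreationDate
FROM posts
WHERE UserId = 8592047;
```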