-[TiSpark](https://github.com/pingcap/tispark) is a thin layer built for running Apache Spark on top of TiDB/TiKV to answer the complex OLAP queries. It takes advantages of both the Spark platform and the distributed TiKV cluster and seamlessly glues to TiDB, the distributed OLTP database, to provide a Hybrid Transactional/Analytical Processing (HTAP) solution to serve as a one-stop solution for both online transactions and analysis.
-
-[TiFlash](/tiflash/tiflash-overview.md) is another tool that enables HTAP. Both TiFlash and TiSpark allow the use of multiple hosts to execute OLAP queries on OLTP data. TiFlash stores data in a columnar format, which allows more efficient analytical queries. TiFlash and TiSpark can be used together.
-
-## What is TiSpark
+> **Warning:**
+>
+> - TiSpark does not guarantee compatibility with TiDB v7.0.0 and later versions.
+> - TiSpark does not guarantee compatibility with Spark v3.4.0 and later versions.
 
 TiSpark depends on the TiKV cluster and the PD cluster. You also need to set up a Spark cluster. This document provides a brief introduction to how to setup and use TiSpark. It requires some basic knowledge of Apache Spark. For more information, see [Apache Spark website](https://spark.apache.org/docs/latest/index.html).
 
@@ -33,6 +28,16 @@ Also, TiSpark supports distributed writes to TiKV. Compared with writes to TiDB
 >
 > Because TiSpark accesses TiKV directly, the access control mechanisms used by TiDB Server are not applicable to TiSpark. Since TiSpark v2.5.0, TiSpark supports user authentication and authorization, for more information, see [Security](/tispark-overview.md#security).
 
+The following diagram shows the architecture of TiSpark.
+[TiSpark](https://github.com/pingcap/tispark) is a thin layer built for running Apache Spark on top of TiDB/TiKV to answer complex OLAP queries. It takes advantage of both the Spark platform and the distributed TiKV cluster and seamlessly integrates with TiDB, the distributed OLTP database, to provide a Hybrid Transactional/Analytical Processing (HTAP) solution to serve as a one-stop solution for both online transactions and analysis.
+
+[TiFlash](/tiflash/tiflash-overview.md) is another tool that enables HTAP. Both TiFlash and TiSpark allow the use of multiple hosts to execute OLAP queries on OLTP data. TiFlash stores data in a columnar format, which allows more efficient analytical queries. TiFlash and TiSpark can be used together.
+
 ## Requirements
 
 + TiSpark supports Spark >= 2.3.
@@ -101,11 +106,6 @@ You can choose TiSpark version according to your TiDB and Spark version.
 
 TiSpark 2.4.4, 2.5.3, 3.0.3, 3.1.7, and 3.2.3 are the latest stable versions and are highly recommended.
 
-> **Note:**
->
-> TiSpark does not guarantee compatibility with TiDB v7.0.0 and later versions.
-> TiSpark does not guarantee compatibility with Spark v3.4.0 and later versions.
-
 ### Get TiSpark jar
 
 You can get the TiSpark jar using one of the following methods: