Commit 234d004

Integrate/ingestr: Add section and category item
1 parent 3a4d5a6 commit 234d004

3 files changed: +134 -0 lines changed

docs/ingest/etl/index.md

Lines changed: 5 additions & 0 deletions
@@ -48,6 +48,11 @@ outlines how to use them effectively. Additionally, see support for {ref}`cdc` s
 Apache Flink is a programming framework and distributed processing engine for
 stateful computations over unbounded and bounded data streams, written in Java.

+- {ref}`ingestr`
+
+ingestr is a command-line application that allows copying data from any
+source into any destination database.
+
 - {ref}`kestra`

 Kestra is an open-source workflow automation and orchestration toolkit with a rich

docs/integrate/index.md

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ grafana/index
 hop/index
 iceberg/index
 influxdb/index
+ingestr/index
 kafka/index
 kestra/index
 kinesis/index

docs/integrate/ingestr/index.md

Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
(ingestr)=
# ingestr

```{div} .float-right .text-right
<a href="https://github.com/crate/cratedb-examples/actions/workflows/application-ingestr.yml" target="_blank" rel="noopener noreferrer">
<img src="https://img.shields.io/github/actions/workflow/status/crate/cratedb-examples/application-ingestr.yml?branch=main&label=ingestr" loading="lazy" alt="CI status: ingestr"></a>
```
```{div} .clearfix
```

[ingestr] is a command-line application that allows copying data from any
source into any destination database. It supports CrateDB on the source
and the destination side. ingestr uses {ref}`dlt`.

::::{grid}

:::{grid-item}
- **Single command**: ingestr allows copying & ingesting data from any source
to any destination with a single command.

- **Many sources & destinations**: ingestr supports all common source and
destination databases.

- **Incremental loading**: ingestr supports both full-refresh and
incremental loading modes.
:::

:::{grid-item}
![ingestr in a nutshell](https://github.com/bruin-data/ingestr/blob/main/resources/demo.gif?raw=true){loading=lazy}
:::

::::
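
The incremental loading mode could be invoked roughly as sketched here. This is an illustration under assumptions, not an excerpt from the CrateDB documentation: the table `doc.events` and its `updated_at` column are hypothetical, and the `--incremental-strategy` and `--incremental-key` options should be verified with `ingestr ingest --help` for the installed version. The `run` function is a hypothetical dry-run helper that only prints the command unless `RUN=1` is set.

```shell
# Dry-run wrapper (hypothetical helper): print the command, execute only if RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Intended to append only rows that are newer than the previous run,
# judged by the `updated_at` column.
# `doc.events` and `updated_at` are assumed names for illustration.
run ingestr ingest \
    --source-uri 'crate://crate@localhost:4200/' \
    --source-table 'doc.events' \
    --dest-uri 'duckdb:///cratedb.duckdb' \
    --dest-table 'dest.events' \
    --incremental-strategy 'append' \
    --incremental-key 'updated_at'
```

Printing the command first makes it easy to review the URIs and table names before anything is written to the destination.
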


## Synopsis

Invoke ingestr to export data from CrateDB.
```shell
# Copy CrateDB's `sys.summits` table into a local DuckDB database file.
ingestr ingest \
    --source-uri 'crate://crate@localhost:4200/' \
    --source-table 'sys.summits' \
    --dest-uri 'duckdb:///cratedb.duckdb' \
    --dest-table 'dest.summits'
```

Invoke ingestr to load data into CrateDB.
```shell
# Load the CSV file `input.csv` into the CrateDB table `doc.sample`.
ingestr ingest \
    --source-uri 'csv://input.csv' \
    --source-table 'sample' \
    --dest-uri 'cratedb://crate:@localhost:5432/?sslmode=disable' \
    --dest-table 'doc.sample'
```
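
To try the loading command, a small input file is enough. The sketch below writes a hypothetical `input.csv`; the column names and rows are invented for illustration and do not come from the ingestr or CrateDB documentation.

```shell
# Create a tiny CSV file to feed into `--source-uri 'csv://input.csv'`.
# Columns and values are placeholders for illustration only.
cat > input.csv <<'EOF'
id,name,value
1,alpha,10
2,beta,20
EOF

# Show what would be ingested.
cat input.csv
```
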

:::{note}
Note the subtle differences between the CrateDB source and destination URLs.
While `--source-uri=crate://...` addresses CrateDB's SQLAlchemy dialect,
`--dest-uri=cratedb://...` is effectively a PostgreSQL connection URL
with a URL scheme designating CrateDB. The source adapter uses
CrateDB's HTTP protocol, while the destination adapter uses CrateDB's
PostgreSQL interface.
:::
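
To keep the two URL flavors apart in scripts, both can be derived from the same host and user settings. The following is a minimal sketch: the function and variable names are hypothetical, and the ports (4200 for HTTP, 5432 for the PostgreSQL interface) are taken from the examples in the Synopsis.

```shell
# Hypothetical settings; defaults mirror the Synopsis examples.
CRATEDB_HOST="${CRATEDB_HOST:-localhost}"
CRATEDB_USER="${CRATEDB_USER:-crate}"

# Source side: SQLAlchemy dialect, talks to CrateDB's HTTP interface.
cratedb_source_uri() {
    printf 'crate://%s@%s:4200/\n' "$CRATEDB_USER" "$CRATEDB_HOST"
}

# Destination side: PostgreSQL connection URL using the `cratedb://` scheme.
cratedb_dest_uri() {
    printf 'cratedb://%s:@%s:5432/?sslmode=disable\n' "$CRATEDB_USER" "$CRATEDB_HOST"
}

cratedb_source_uri
cratedb_dest_uri
```

The results can then be passed along, for example as `--source-uri "$(cratedb_source_uri)"` or `--dest-uri "$(cratedb_dest_uri)"`.
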


## Coverage

ingestr supports migration from more than 20 databases, data platforms, and
analytics engines, including all [databases supported by SQLAlchemy].
See the [sources supported by ingestr] for the complete list.

:::{rubric} Databases
:::
Actian Data Platform, Vector, Actian X, Ingres, Amazon Athena, Amazon Redshift,
Amazon S3, Apache Drill, Apache Druid, Apache Hive and Presto, Apache Solr,
ClickHouse, CockroachDB, CrateDB, Databend, Databricks, Denodo, DuckDB, EXASOL DB,
Elasticsearch, Firebird, Firebolt, Google BigQuery, Google Sheets, Greenplum,
HyperSQL (hsqldb), IBM DB2 and Informix, IBM Netezza Performance Server, Impala, InfluxDB,
Kinetica, Microsoft Access, Microsoft SQL Server, MonetDB, MongoDB, MySQL and MariaDB,
OpenGauss, OpenSearch, Oracle, PostgreSQL, Rockset, SAP ASE, SAP HANA,
SAP Sybase SQL Anywhere, Snowflake, SQLite, Teradata Vantage, TiDB, YDB, YugabyteDB.

:::{rubric} Brokers
:::
Amazon Kinesis, Apache Kafka (Amazon MSK, Confluent Kafka, Redpanda, RobustMQ)

:::{rubric} File formats
:::
CSV, JSONL/NDJSON, Parquet

:::{rubric} Object stores
:::
Amazon S3, Google Cloud Storage

:::{rubric} Services
:::
Airtable, Asana, GitHub, Google Ads, Google Analytics, Google Sheets, HubSpot,
Notion, Personio, Salesforce, Slack, Stripe, Zendesk, etc.


## Learn

::::{grid}

:::{grid-item-card} Documentation: ingestr CrateDB source
:link: https://bruin-data.github.io/ingestr/supported-sources/cratedb.html#source
:link-type: url
Documentation about the CrateDB source adapter for ingestr.
:::

:::{grid-item-card} Documentation: ingestr CrateDB destination
:link: https://bruin-data.github.io/ingestr/supported-sources/cratedb.html#destination
:link-type: url
Documentation about the CrateDB destination adapter for ingestr.
:::

:::{grid-item-card} Examples: Use ingestr with CrateDB
:link: https://github.com/crate/cratedb-examples/tree/main/application/ingestr
:link-type: url
Executable code examples that demonstrate how to use ingestr to
load data from Kafka into CrateDB.
:::

::::



[databases supported by SQLAlchemy]: https://docs.sqlalchemy.org/en/20/dialects/
[ingestr]: https://bruin-data.github.io/ingestr/
[sources supported by ingestr]: https://bruin-data.github.io/ingestr/supported-sources/
