Support Connection through Arrow Flight RPC / ADBC #913

Open
v-kessler opened this issue Sep 5, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@v-kessler

What is the problem the feature request solves?

Rationale

The Arrow ecosystem lacks standard database interfaces built around Arrow data, especially for efficiently fetching large datasets (i.e. with minimal or no serialization and copying). Without a common API, the end result is a mix of custom protocols (e.g. BigQuery, Snowflake) and adapters (e.g. Turbodbc) scattered across languages. Consumers must laboriously wrap individual systems (as DBI is contemplating and Trino does with connectors).

ADBC aims to provide a minimal database client API standard, based on Arrow, for C, Go, and Java (with bindings for other languages). Applications code to this API standard (in much the same way as they would with JDBC or ODBC), but fetch result sets in Arrow format (e.g. via the C Data Interface). They then link to an implementation of the standard: either directly to a vendor-supplied driver for a particular database, or to a driver manager that abstracts across multiple drivers. Drivers implement the standard using a database-specific API, such as Flight SQL.
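To make the driver / driver manager split concrete, here is a minimal Python sketch. It is a toy model and not the ADBC API: `Driver`, `InMemoryDriver`, `DriverManager`, and the column dictionary standing in for Arrow data are all hypothetical names invented for illustration.

```python
from typing import Dict, List, Protocol

# Stand-in for an Arrow table: column name -> list of values (columnar layout).
Columns = Dict[str, List[object]]


class Driver(Protocol):
    """Hypothetical driver interface: run a query, return columnar data."""

    def execute(self, query: str) -> Columns: ...


class InMemoryDriver:
    """Hypothetical driver backed by a dict of canned results."""

    def __init__(self, results: Dict[str, Columns]) -> None:
        self._results = results

    def execute(self, query: str) -> Columns:
        return self._results[query]


class DriverManager:
    """Hypothetical manager: the application names a driver instead of linking to it."""

    def __init__(self) -> None:
        self._drivers: Dict[str, Driver] = {}

    def register(self, name: str, driver: Driver) -> None:
        self._drivers[name] = driver

    def execute(self, name: str, query: str) -> Columns:
        # Dispatch to whichever driver was registered under this name.
        return self._drivers[name].execute(query)


manager = DriverManager()
manager.register("warehouse", InMemoryDriver({"SELECT id FROM t": {"id": [1, 2, 3]}}))
manager.register("lake", InMemoryDriver({"SELECT id FROM t": {"id": [4, 5]}}))

# The application code is identical for both backends; only the driver name changes.
warehouse_ids = manager.execute("warehouse", "SELECT id FROM t")["id"]
lake_ids = manager.execute("lake", "SELECT id FROM t")["id"]
```

The point of the sketch is only the shape: the application talks to one interface and gets columnar results back, while the choice of backend is a registration detail.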

Goals

  • Provide a cross-language, Arrow-based API to standardize how clients submit queries to and fetch Arrow data from databases.
  • Support both SQL dialects and the emergent Substrait standard.
  • Support explicitly partitioned/distributed result sets to work better with contemporary distributed systems.
  • Allow for a variety of implementations to maximize reach.

Describe the potential solution

The implementation could be done in three steps, each building on the previous one:

  1. Arrow Flight RPC https://arrow.apache.org/docs/format/Flight.html
  2. Arrow Flight SQL https://arrow.apache.org/docs/format/FlightSql.html
  3. ADBC https://arrow.apache.org/docs/format/ADBC.html
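As a rough illustration of how the three steps stack, here is a small Python sketch using only the standard library. It is a toy model under stated assumptions, not an implementation of any of the specs: `FlightLayer`, `FlightSqlLayer`, `AdbcLikeClient`, `Endpoint`, and the dict-based `Batch` are hypothetical names. The method names `do_get` and `get_flight_info` only echo the DoGet and GetFlightInfo calls from Flight RPC, and a single in-process store stands in for what would be separate servers.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List

Batch = Dict[str, List[int]]  # stand-in for an Arrow record batch


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical: where one partition of a result lives, plus its ticket."""

    location: str
    ticket: str


class FlightLayer:
    """Step 1 (toy): stream the batches stored behind a ticket."""

    def __init__(self, store: Dict[str, List[Batch]]) -> None:
        self._store = store

    def do_get(self, ticket: str) -> Iterator[Batch]:
        yield from self._store[ticket]


class FlightSqlLayer:
    """Step 2 (toy): map a SQL string to the endpoints holding its result."""

    def __init__(self, plans: Dict[str, List[Endpoint]]) -> None:
        self._plans = plans

    def get_flight_info(self, query: str) -> List[Endpoint]:
        return self._plans[query]


class AdbcLikeClient:
    """Step 3 (toy): a single call that hides the two layers below."""

    def __init__(self, sql: FlightSqlLayer, flight: FlightLayer) -> None:
        self._sql = sql
        self._flight = flight

    def execute(self, query: str) -> Iterator[Batch]:
        # One query may resolve to several partitions; fetch each in turn.
        for endpoint in self._sql.get_flight_info(query):
            yield from self._flight.do_get(endpoint.ticket)


flight = FlightLayer({"t-0": [{"x": [1, 2]}], "t-1": [{"x": [3]}, {"x": [4, 5]}]})
sql = FlightSqlLayer(
    {"SELECT x FROM t": [Endpoint("node-a", "t-0"), Endpoint("node-b", "t-1")]}
)
client = AdbcLikeClient(sql, flight)

# Flatten all batches from all partitions into one list of values.
values = [v for batch in client.execute("SELECT x FROM t") for v in batch["x"]]
```

In this sketch the partitions are read sequentially for simplicity; a real client could fetch endpoints in parallel, which is where the partitioned result set goal comes in.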

Additional context

No response

@v-kessler v-kessler added the enhancement New feature or request label Sep 5, 2024
@v-kessler
Author

@vaibhawvipul as discussed here is the issue

@vaibhawvipul
Contributor

@vaibhawvipul as discussed here is the issue

Thank you.
