@AndreaBozzo
Owner

@AndreaBozzo AndreaBozzo commented Jan 20, 2026

Work in Progress - ConnectorX Integration

Experimental work to integrate ConnectorX as an alternative/complement to existing sqlx connectors for database → Arrow loading.

Meant for #185

What Has Been Done

  • Added connectorx dependency with feature flags
  • Implemented ConnectorXLoader for Postgres/MySQL
  • Feature flags: connectorx-postgres, connectorx-mysql, connectorx-all, production-cx
  • Unit tests for config builder and database type detection
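For readers who have not opened the branch, database type detection can be pictured roughly as below. This is only a sketch: `DatabaseType` and `detect_database_type` are hypothetical names chosen for illustration, and the real code in src/engines/connectorx_loader.rs may be structured differently. It assumes detection is driven by the scheme of the connection URL.

```rust
/// Hypothetical enum for illustration; the type in this PR may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DatabaseType {
    Postgres,
    MySql,
}

/// Guess the database type from the scheme of a connection URL.
/// Returns None for anything this branch does not support.
pub fn detect_database_type(url: &str) -> Option<DatabaseType> {
    // Everything before "://" is treated as the scheme.
    let (scheme, _rest) = url.split_once("://")?;
    match scheme.to_ascii_lowercase().as_str() {
        "postgres" | "postgresql" => Some(DatabaseType::Postgres),
        "mysql" => Some(DatabaseType::MySql),
        // SQLite lands here because ConnectorX SQLite support is disabled.
        _ => None,
    }
}
```

A unit test in this style would then assert on a handful of URLs, including an unsupported one such as a `sqlite://` URL.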

Known Issues

1. Arrow Version Mismatch (BLOCKER)

ConnectorX uses Arrow v54.3.1, while the project uses Arrow v57.2.0.
Error: expected arrow::array::RecordBatch, found arrow_array::record_batch::RecordBatch

This is a fundamental incompatibility: Cargo builds both major versions of the arrow-array crate side by side, and Rust treats their RecordBatch types as completely different types even though they share a name.
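To make the mismatch concrete, here is a minimal stand-alone sketch. The two modules are stand-ins for the two Arrow releases (the names `arrow_v54`, `arrow_v57` and the `values` field are invented for illustration and are not Arrow's real layout). It shows that two structurally identical types with the same name are still unrelated to the compiler.

```rust
// Stand-ins for two semver-incompatible releases of one crate.
// Module and field names are illustrative, not Arrow's real layout.
mod arrow_v54 {
    pub struct RecordBatch {
        pub values: Vec<i64>,
    }
}

mod arrow_v57 {
    pub struct RecordBatch {
        pub values: Vec<i64>,
    }
}

/// Accepts only the v57 type, as the project's own APIs would.
pub fn row_count(batch: &arrow_v57::RecordBatch) -> usize {
    batch.values.len()
}

/// Returns true only if the compiler considers the two types identical.
pub fn same_record_batch_type() -> bool {
    use std::any::TypeId;
    TypeId::of::<arrow_v54::RecordBatch>() == TypeId::of::<arrow_v57::RecordBatch>()
}

// Passing the v54 value where v57 is expected does not compile:
// row_count(&arrow_v54::RecordBatch { values: vec![1] });
// error[E0308]: mismatched types
```

No cast or `From` impl exists between the two, so any fix has to either align the versions or go through a version-neutral representation.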

2. libsqlite3-sys Conflict

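connectorx/src_sqlite and sqlx/sqlite pull in different versions of libsqlite3-sys. Cargo allows only one crate in the dependency graph to link a given native library, so the two features cannot be enabled together.
SQLite support via ConnectorX is temporarily disabled.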

Potential Solutions

Option 1: connector_arrow (Recommended)

A lighter fork of ConnectorX focused on Rust usage.

Option 2: Arrow IPC Conversion

  • Serialize RecordBatch via Arrow IPC (wire format)
  • Deserialize with the correct Arrow version
  • Adds overhead but works across version boundaries
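The shape of this option, sketched with the standard library only: the producer type is encoded to bytes, and the consumer type is rebuilt from those bytes, so neither side needs to name the other's type. The `producer` and `consumer` modules and the toy little-endian encoding are stand-ins invented for illustration. A real bridge would presumably write with `arrow::ipc::writer::StreamWriter` from the Arrow 54 crate and read with `arrow::ipc::reader::StreamReader` from the Arrow 57 crate.

```rust
// Producer side: stands in for the Arrow 54 batch that ConnectorX returns.
mod producer {
    pub struct Batch {
        pub values: Vec<i64>,
    }

    impl Batch {
        /// Encode as little-endian i64s, a toy stand-in for the IPC stream.
        pub fn to_wire(&self) -> Vec<u8> {
            let mut out = Vec::with_capacity(self.values.len() * 8);
            for v in &self.values {
                out.extend_from_slice(&v.to_le_bytes());
            }
            out
        }
    }
}

// Consumer side: stands in for the Arrow 57 batch the project expects.
mod consumer {
    pub struct Batch {
        pub values: Vec<i64>,
    }

    impl Batch {
        /// Decode the toy wire format; None if the length is not a multiple of 8.
        pub fn from_wire(bytes: &[u8]) -> Option<Batch> {
            if bytes.len() % 8 != 0 {
                return None;
            }
            let mut values = Vec::with_capacity(bytes.len() / 8);
            for chunk in bytes.chunks_exact(8) {
                let mut raw = [0u8; 8];
                raw.copy_from_slice(chunk);
                values.push(i64::from_le_bytes(raw));
            }
            Some(Batch { values })
        }
    }
}

/// Cross the version boundary through bytes instead of through types.
pub fn bridge(batch: &producer::Batch) -> Option<consumer::Batch> {
    consumer::Batch::from_wire(&batch.to_wire())
}
```

The extra copy in `bridge` is the overhead mentioned above: every batch is serialized and deserialized once before the project can use it.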

Option 3: Downgrade Arrow

  • Move to Arrow 54/55 project-wide
  • Not ideal - would lose DataFusion 52 features

Documentation References

TODO

  • Resolve Arrow version incompatibility
  • Test with real databases
  • Benchmark against sqlx connectors
  • Document public API
  • Add usage examples

Technical Notes

# Available feature flags
connectorx-postgres = ["dep:connectorx", "connectorx/src_postgres", "connectorx/dst_arrow", "connectorx/fptr"]
connectorx-mysql = ["dep:connectorx", "connectorx/src_mysql", "connectorx/dst_arrow", "connectorx/fptr"]

Main implementation: src/engines/connectorx_loader.rs


Note: This is an experimental branch - do not merge without resolving the Arrow version compatibility issue.

- Add connectorx dependency with feature flags
- Implement ConnectorXLoader for Postgres/MySQL
- Known issue: Arrow version mismatch (connectorx uses v54, we use v57)
- connectorx-sqlite disabled due to libsqlite3-sys conflict with sqlx

This is experimental work, needs resolution of Arrow version incompatibility.
Comment on lines +3664 to +3671
[[package]]
name = "owning_ref"
version = "0.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6ff55baddef9e4ad00f88b6c743a2a8062d4c6ade126c2a528644b8e444d52ce"
dependencies = [
"stable_deref_trait",
]

Check warning

Code scanning / Trivy

owning_ref vulnerable to multiple soundness issues Medium

Package: owning_ref
Installed Version: 0.4.1
Vulnerability GHSA-9qxh-258v-666c
Severity: MEDIUM
Fixed Version:
Link: GHSA-9qxh-258v-666c
Comment on lines +3197 to +3204
[[package]]
name = "lru"
version = "0.12.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "234cf4f4a04dc1f57e24b96cc0cd600cf2af460d4161ac5ecdd0af8e1f3b2a38"
dependencies = [
"hashbrown 0.15.5",
]

Check notice

Code scanning / Trivy

`IterMut` violates Stacked Borrows by invalidating internal pointer Low

Package: lru
Installed Version: 0.12.5
Vulnerability GHSA-rhfx-m35p-ff5j
Severity: LOW
Fixed Version: 0.16.3
Link: GHSA-rhfx-m35p-ff5j
@AndreaBozzo AndreaBozzo changed the title WIP: ConnectorX integration for high-performance database loading WIP: arrow_connector / ConnectorX integration for high-performance database loading Jan 20, 2026
@AndreaBozzo
Owner Author

The code is rough, but this will stay open until ConnectorX upgrades to Arrow 57 so I can evaluate it.
