[WIP][Kernel][DefaultEngine] Refactor the Default Engine to not use Hadoop directly, instead provide a generic IO interface #4191

vkorukanti · 2025-02-26T23:05:31Z

Description

Hadoop APIs are currently used throughout the default Engine. This is a problem for connectors that don't want to depend on Hadoop. This PR attempts to separate the I/O-related functionality into an interface (FileIO) and provides a Hadoop-based implementation of FileIO for connectors that want to use the default engine with Hadoop. Connectors that don't want Hadoop can provide their own FileIO implementation to the default engine.
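To illustrate the separation described above, here is a minimal sketch of what such an abstraction could look like. Only the name `FileIO` comes from this description; the method names (`open`, `write`, `listFrom`), the `InMemoryFileIO` implementation, and the `LogReader` consumer are hypothetical and are not the API in this patch. The point is only that engine-side code depends on the interface, so a connector can plug in a non-Hadoop implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical I/O abstraction; method names are illustrative assumptions,
// not the interface defined in this PR.
interface FileIO {
    InputStream open(String path) throws IOException;

    void write(String path, byte[] data, boolean overwrite) throws IOException;

    // Returns all known paths that sort lexicographically at or after `path`.
    List<String> listFrom(String path) throws IOException;
}

// A non-Hadoop implementation a connector might supply (here: in memory).
final class InMemoryFileIO implements FileIO {
    private final TreeMap<String, byte[]> files = new TreeMap<>();

    @Override
    public InputStream open(String path) throws IOException {
        byte[] data = files.get(path);
        if (data == null) {
            throw new FileNotFoundException(path);
        }
        return new ByteArrayInputStream(data);
    }

    @Override
    public void write(String path, byte[] data, boolean overwrite) throws IOException {
        if (!overwrite && files.containsKey(path)) {
            throw new FileAlreadyExistsException(path);
        }
        files.put(path, data.clone());
    }

    @Override
    public List<String> listFrom(String path) {
        return new ArrayList<>(files.tailMap(path, true).keySet());
    }
}

// Engine-side code depends only on FileIO, never on Hadoop classes.
final class LogReader {
    private final FileIO io;

    LogReader(FileIO io) {
        this.io = io;
    }

    String readText(String path) throws IOException {
        try (InputStream in = io.open(path)) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

Under this sketch, a Hadoop-backed class would implement the same interface by delegating to Hadoop's `FileSystem`, and the default engine would accept whichever `FileIO` the connector passes in.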

WIP: the Parquet write-side work remains.

How was this patch tested?

Existing tests. A test still needs to be added to verify that the code does not use Hadoop directly.

vkorukanti added 2 commits February 26, 2025 14:57

wip

8cb315b

wip

c3407a2
