Use fsspec optimized reading for lists of Parquet files #389

@gitosaurus

Feature request

In #385, read_parquet was enhanced to use fsspec.parquet.open_parquet_file for optimized reading of single remote files, while keeping the existing implementation for the rest of the read_parquet interface (directories, file-like objects, and lists thereof).

The same single-file optimization could be applied to the individual file paths inside a list. That would require some more refactoring, and only users who pass remote file paths through the list interface would see a benefit, but they would certainly benefit.
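
As a rough illustration of the idea, here is a minimal sketch of how list entries might be dispatched one at a time. The names `is_remote_path` and `read_parquet_list` are hypothetical and not part of the existing code. The sketch assumes pyarrow is the Parquet engine, that every file in the list shares a schema, and that non-remote entries (local paths, file-like objects) can go straight to `pyarrow.parquet.read_table`. The real refactoring would need to fit the current read_parquet internals.

```python
from urllib.parse import urlparse


def is_remote_path(item) -> bool:
    """Return True for string paths with a non-local URL scheme (s3://, https://, ...)."""
    if not isinstance(item, str):
        # File-like objects and Path objects stay on the existing code path.
        return False
    scheme = urlparse(item).scheme
    # A single-letter "scheme" is a Windows drive letter such as C:\data.
    return len(scheme) > 1 and scheme != "file"


def read_parquet_list(paths, columns=None, storage_options=None):
    """Hypothetical helper: read each list entry and concatenate the results.

    Remote string paths go through fsspec's optimized opener; everything
    else is handed to pyarrow unchanged.
    """
    # Imported here so the path check above works without the optional
    # dependencies installed.
    import pyarrow as pa
    import pyarrow.parquet as pq
    from fsspec.parquet import open_parquet_file

    tables = []
    for item in paths:
        if is_remote_path(item):
            # Prefetches only the byte ranges needed for the selected columns.
            with open_parquet_file(
                item,
                columns=columns,
                storage_options=storage_options,
                engine="pyarrow",
            ) as f:
                tables.append(pq.read_table(f, columns=columns))
        else:
            tables.append(pq.read_table(item, columns=columns))
    # Assumes all files share a compatible schema.
    return pa.concat_tables(tables)
```

A call such as `read_parquet_list(["s3://bucket/a.parquet", "local/b.parquet"], columns=["ra", "dec"])` (bucket, file, and column names invented for illustration) would use the optimized opener for the first entry only.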

Before submitting
Please check the following:

  • I have described the purpose of the suggested change, specifying what I need the enhancement to accomplish, i.e. what problem it solves.
  • I have included any relevant links, screenshots, environment information, and data relevant to implementing the requested feature, as well as pseudocode for how I want to access the new functionality.
  • If I have ideas for how the new feature could be implemented, I have provided explanations and/or pseudocode and/or task lists for the steps.

Labels: enhancement
