superserious-dev/parkhay

Parkhay


Parkhay is an interactive visualization tool for exploring the physical layout of Parquet files. It presents information at the level of the Thrift specification, which makes it well suited to anyone who needs a detailed understanding of the internal file structure. Raw bytes can be previewed on demand for non-metadata sections, e.g. column chunk pages and Bloom filter bitsets.
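For readers new to the physical layout, here is a minimal sketch of how the Thrift-encoded footer can be located in an unencrypted Parquet file. It relies only on the published file format: a leading `PAR1` magic, and a trailer made of a 4-byte little-endian footer length followed by `PAR1`. The names `locate_footer` and `FooterLocation` are hypothetical and chosen for illustration; this is not Parkhay's own code.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Magic bytes found at both ends of a Parquet file with a plaintext footer.
const MAGIC: &[u8; 4] = b"PAR1";

/// Byte range of the Thrift-encoded FileMetaData (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub struct FooterLocation {
    pub offset: u64,
    pub len: u32,
}

/// Locate the footer by reading the leading magic and the 8-byte trailer.
pub fn locate_footer<R: Read + Seek>(reader: &mut R) -> io::Result<FooterLocation> {
    let file_len = reader.seek(SeekFrom::End(0))?;
    // Smallest possible file: leading magic (4) + footer length (4) + trailing magic (4).
    if file_len < 12 {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "file too small"));
    }

    let mut head = [0u8; 4];
    reader.seek(SeekFrom::Start(0))?;
    reader.read_exact(&mut head)?;

    // Trailer: 4-byte little-endian footer length, then the magic.
    let mut tail = [0u8; 8];
    reader.seek(SeekFrom::Start(file_len - 8))?;
    reader.read_exact(&mut tail)?;

    if head != *MAGIC || tail[4..] != MAGIC[..] {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "missing PAR1 magic"));
    }

    let len = u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]);
    // The footer must fit between the leading magic and the trailer.
    if u64::from(len) + 12 > file_len {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "footer length out of range"));
    }

    Ok(FooterLocation { offset: file_len - 8 - u64::from(len), len })
}
```

Passing a `std::fs::File` opened on a `.parquet` file returns the byte range that a Thrift compact-protocol decoder would then parse into `FileMetaData`.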

Using Parkhay

To run: cargo run -- <path/to/file.parquet>

[Screenshot of the GUI, part 1]

[Screenshot of the GUI, part 2]

Roadmap

  • Support object storage reads via OpenDAL
  • Warn when regions of the file are not referenced from the metadata, e.g. arbitrary embedded binary content
  • Enable on-demand export of raw page bytes to a file
  • Display user-defined indexes referenced within KeyValue metadata
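
The roadmap item about unreferenced regions could be approached as a gap search over byte ranges. The sketch below is an assumption about how that might look, not a planned implementation: `unreferenced_regions` is a hypothetical function that takes the file length and the byte ranges the metadata points at (magic bytes, pages, indexes, footer) and returns whatever is left uncovered.

```rust
use std::ops::Range;

/// Return the byte ranges of a file that no metadata entry refers to.
/// `referenced` may be unsorted and may overlap (hypothetical helper).
pub fn unreferenced_regions(file_len: u64, referenced: &[Range<u64>]) -> Vec<Range<u64>> {
    // Clamp every range to the file, drop empty ones, and sort by start offset.
    let mut sorted: Vec<Range<u64>> = referenced
        .iter()
        .map(|r| r.start.min(file_len)..r.end.min(file_len))
        .filter(|r| r.start < r.end)
        .collect();
    sorted.sort_by_key(|r| r.start);

    let mut gaps = Vec::new();
    let mut cursor = 0u64; // end of the region covered so far
    for r in sorted {
        if r.start > cursor {
            gaps.push(cursor..r.start);
        }
        cursor = cursor.max(r.end);
    }
    // Anything after the last referenced byte is also unreferenced.
    if cursor < file_len {
        gaps.push(cursor..file_len);
    }
    gaps
}
```

For a 20-byte file with referenced ranges `0..4`, `4..10`, and `12..20`, the function reports the single gap `10..12`.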

Current Limitations

  • Files with geospatial types, e.g. those in here, are not supported
  • Encrypted files are not supported
