This project implements and evaluates Integer Wavelet Trees (IWTs) and higher-fanout T-way IWTs as space-efficient mappings between the sorted and physical order of data in learned indexes.
- Evaluate the trade-off between compression and access latency.
- Integrate IWTs with learned indexes like RadixSpline.
- Compare against classical indexes like B+-trees and QuIT.
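To make the idea of a sorted-to-physical mapping concrete, the following is a minimal sketch using a textbook binary wavelet tree. It is not the IWT or T-way IWT implementation from `src/`; the names `WaveletNode` and `sorted_to_physical` are hypothetical, and the per-node prefix counts are a deliberately simple (and not space-efficient) stand-in for a succinct rank structure.

```cpp
#include <algorithm>
#include <cstdint>
#include <memory>
#include <numeric>
#include <vector>

// Textbook binary wavelet tree over integers in [lo, hi).
// Hypothetical illustration only; not the IWT from src/.
struct WaveletNode {
    uint32_t lo, hi;             // value range covered by this node
    std::vector<bool> bits;      // bits[i] is 1 if the i-th value goes right
    std::vector<uint32_t> ones;  // ones[i] = number of 1-bits in bits[0, i)
    std::unique_ptr<WaveletNode> left, right;

    WaveletNode(const std::vector<uint32_t>& vals, uint32_t lo_, uint32_t hi_)
        : lo(lo_), hi(hi_) {
        if (hi - lo <= 1 || vals.empty()) return;  // leaf: nothing to split
        const uint32_t mid = lo + (hi - lo) / 2;
        std::vector<uint32_t> l, r;
        ones.push_back(0);
        for (uint32_t v : vals) {
            const bool go_right = v >= mid;
            bits.push_back(go_right);
            ones.push_back(ones.back() + (go_right ? 1u : 0u));
            (go_right ? r : l).push_back(v);
        }
        left = std::make_unique<WaveletNode>(l, lo, mid);
        right = std::make_unique<WaveletNode>(r, mid, hi);
    }

    // Return the i-th value of the original sequence (i must be in range).
    uint32_t access(uint32_t i) const {
        const WaveletNode* node = this;
        while (node->hi - node->lo > 1) {
            const bool go_right = node->bits[i];
            const uint32_t r1 = node->ones[i];  // 1-bits before position i
            i = go_right ? r1 : i - r1;         // position inside the child
            node = go_right ? node->right.get() : node->left.get();
        }
        return node->lo;
    }
};

// perm[s] = physical position of the s-th smallest key.
std::vector<uint32_t> sorted_to_physical(const std::vector<uint64_t>& keys) {
    std::vector<uint32_t> perm(keys.size());
    std::iota(perm.begin(), perm.end(), 0u);
    std::stable_sort(perm.begin(), perm.end(),
                     [&](uint32_t a, uint32_t b) { return keys[a] < keys[b]; });
    return perm;
}
```

Assuming `keys` holds the data in physical order, one would build `WaveletNode wt(perm, 0, perm.size())` from `perm = sorted_to_physical(keys)`; a learned index that predicts sorted rank `s` can then read the key as `keys[wt.access(s)]` without physically sorting the data.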
wavelet-tree-mapping/
│
├── src/ # header files for all wavelet tree variants in the paper
├── external/ # 3rd-party dependencies
├── workloads/ # place .bin workload files in this directory
├── results/ # output and logs will go here
├── permute_test.cpp # tests the wavelet trees independently
├── li_wavelet_test.cpp # Learned index + wavelet trees
├── li_lsi_test.cpp # Learned index + LSI
├── CMakeLists.txt # Main CMake file
├── external/CMakeLists.txt # dependency CMake file
└── README.md # Project guide
- CMake >= 3.4
- C++20
- Git
Automatically fetched dependencies:
git clone --recurse-submodules https://github.com/BU-Di-SC/wavelet-tree-mapping.git
cd wavelet-tree-mapping
mkdir build
cd build
cmake ..
make <binary_name>
Our tests use the .bin format for workloads; they can easily be adapted to accept CSV as well.
We use the BoDS framework to generate workloads.
- To replicate, follow the steps from BoDS to generate .bin files with varying degrees of sortedness.
- Place the .bin files in a workloads/ folder at the project root.
Example:
workloads/
├── n16777216_k0l0.bin # Fully sorted
├── n16777216_k25l25.bin # Partially sorted
├── ...
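As a sanity check before running the tests, a workload file can be loaded and inspected with a few lines of C++. This is only a sketch: it assumes each .bin file is a headerless array of fixed-width integers in native byte order (the actual layout and key width produced by BoDS should be checked), and `load_bin_workload` and `count_descents` are hypothetical helpers, not part of this repository.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Assumption: the file is a headerless array of KeyT values in native byte order.
template <typename KeyT = uint64_t>
std::vector<KeyT> load_bin_workload(const std::string& path) {
    // Open at the end so tellg() reports the file size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) throw std::runtime_error("cannot open " + path);
    const std::streamsize bytes = in.tellg();
    if (bytes < 0 || static_cast<std::size_t>(bytes) % sizeof(KeyT) != 0)
        throw std::runtime_error("size of " + path +
                                 " is not a multiple of the key width");
    std::vector<KeyT> keys(static_cast<std::size_t>(bytes) / sizeof(KeyT));
    in.seekg(0, std::ios::beg);
    if (!in.read(reinterpret_cast<char*>(keys.data()), bytes))
        throw std::runtime_error("short read on " + path);
    return keys;
}

// Number of positions where a key is smaller than its predecessor.
// A fully sorted workload yields 0.
template <typename KeyT>
std::size_t count_descents(const std::vector<KeyT>& keys) {
    std::size_t n = 0;
    for (std::size_t i = 1; i < keys.size(); ++i)
        if (keys[i] < keys[i - 1]) ++n;
    return n;
}
```

Under these assumptions, `count_descents(load_bin_workload<uint64_t>("workloads/n16777216_k0l0.bin"))` should be 0 for the fully sorted file; a nonzero result there would suggest the assumed key width or layout is wrong.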
Example binaries:
- permute_flat_multiary: different variants of multiary trees can be selected later
- li_wavelet_test: learned index with wavelet trees
- li_lsi_test: learned index with LSI
For the B+-tree and QuIT experiments, refer to:
Quick Insertion Tree (QuIT)