Action Precedes Motivation
Read More
- Regular Data structure
- Randomized: Quick Sort, Treap, SkipList
- Misc: List, Stack, Queue, Priority Queue (taken from gods)
- Sorted: Binary Search Tree
- Systems Data structure
- Hash-Based: Probing vs Chaining in Hashmap
- Approximate: Bloom Filter, Count Min Sketch, HyperLogLog
- Rebalancing Map: Consistent Hashing, ChordDHT, Range Based Partitioning, Extendable Hashing
- Hash Functions: Cuckoo Hash, Murmur Hash,
- Sorted: BTree, Binary Search Tree, Treap, Red Black Tree, B Epsilon Tree
- Multi Dimension: KD Tree, Z-Index, Hilbert Curve
- Sampling: Reservoir Sampling
- Stream: Sliding window
- Difference: Merkal Tree
- Time: Hashed Wheel Timer
- Hash-Based: Probing vs Chaining in Hashmap
- Concurrent Data Structures:
- Skip List: Regular, Lazy, LockFree, Arena Skiplist
- BTree
- Radix Tree: ART, VART, Trie
- Fast Succinct Trie: SuRF
- External Memory Algorithms
- External Hashing
- External Sorting
- Performance Analysis
- Heap Dump
- Trace
- Jmeter, YCSB
- VRPC (vector clock)
- Heap View
- Write Benchmark
- Lotsaa
- GC frequency, threshold
- Git Bisect
- Explain Analyze
- OpenTelemetry Trace
- Arena based skip list
- Roaring BitMap
- Zone Map
Read, Extract components, and Learn the inner workings.
Read More
- MatrixOrigin Systems Data Structures Usage
- HyperLogLog: Used to find NDV before writing segment meta header (MO Write SST with Metadata including NDV)
- Bloom Filter: Used for fast range scan through blocks (MO Runtime Filter in Storage Engine)
- MatrixOrigin Join
- Hash Map
- Memory Pool
- Reservoir Sampling
- Storage Engine: WAL, Btree, CoW Btree, LSM Tree
- Memtable using Skip List
Pick a favorite storage engine/main memory system and shrink.
Read More
- Dgraph ristretto
- Dgraph badger: WiscKey, Txn, Memory allocation part in pkg z.
- BigCache:
Revisit papers/slides. Build counterexamples. Think like a scientist.
Read More
- Algorithms and Data Structures for Massive Datasets - BF,
Count-Min
Sketch, HyperLogLog, ReservoirSampling
. - The Art of Multiprocessor Programming: Concurrent Data Structures.
- Advanced Algorithms and Data Structures: BitMap, BloomFilter, LFU, LRU
- 100 Go Mistakes and How to Avoid Them - Concurrency patterns, Mechanical sympathy (last 10 chapters).
- Algorithms for Modern Hardware[Incomplete]: Talk about SIMD, CPU Cache, External Memory, Instruction Level Parallelism.
Write what you understood. (Maybe later)
Read More
- Paper Reading
- A method for implementing lock-free shared data structures
- WiscKey: Separating Keys from Values in SSD-conscious Storage
- A Critique of Snapshot Isolation
Mark your achievements.
Read More
- Kmeans++ & Elkan's Kmeans: Library extracted from MatrixOrigin's IVFFLAT index.