try to compare #20
base: dev
update hotness estimate
add basic filter unit support
restore old version filter
add compile command
Conflicts: .gitignore
fix warning
filter cache client
Pull Request Overview
This PR introduces a full adaptive filter-cache subsystem, including sampling heat via HeatBuckets, allocating filter units with a greedy algorithm and two-heap adjustment, and integrating with a LightGBM-based model and YCSB workloads.
- Add `HeatBuckets` and `SamplesPool` to track and update key-range hotness
- Implement `GreedyAlgo`, `FilterCacheHeap`/`FilterCacheHeapManager`, and `FilterCacheManager` for dynamic unit allocation
- Wire up client APIs (`FilterCacheClient`) and model I/O (`ClfModel`), plus minor YCSB DB changes
Reviewed Changes
Copilot reviewed 80 out of 86 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| db/art/heat_buckets.h/.cc | Sampling and hotness‐tracking buckets |
| db/art/greedy_algo.h/.cc | Greedy OPT unit‐allocation algorithm |
| db/art/filter_cache_item.h/.cc | Skeleton for per‐segment filter units |
| db/art/filter_cache_heap.h | Heap structures for cost/benefit adjustment |
| db/art/filter_cache_client.h/.cc | Async client integration with thread pool |
| db/art/filter_cache.h | Core FilterCache and FilterCacheManager logic |
| db/art/clf_model.h/.cc | CSV I/O & LightGBM server interface |
| YCSB/rocksdb/rocksdb_db.cc | Log workload & switch update to insert |
| YCSB/leveldb/leveldb_db.cc | Remove unused SerializeRow |
Files not reviewed (6)
- CMakeLists.txt: Language not supported
- TARGETS: Language not supported
- YCSB/Makefile: Language not supported
- YCSB/rocksdb/rocksdb.properties: Language not supported
- YCSB/workloads/workloadt: Language not supported
- build_tools/build_detect_platform: Language not supported
Comments suppressed due to low confidence (1)
db/art/clf_model.h:45
`LessorComparor` is misspelled; consider `LessOrComparator` or simply `LessComparator`.
inline bool RangeRatePairLessorComparor(const RangeRatePair& pair_1, const RangeRatePair& pair_2);
| class HeatBuckets { | ||
| private: | ||
| // TODO: mutex can be optimized | ||
| static std::vector<std::string> seperators_; |
Copilot
AI
May 23, 2025
The variable name `seperators_` is misspelled; it should be `separators_`.
| static std::vector<std::string> seperators_; | |
| static std::vector<std::string> separators_; |
| double enable_benifit; | ||
| SegmentAlgoHelper(const uint32_t& id, const uint32_t& cnt, const uint32_t& size, const uint16_t& units) { | ||
| segment_id = id; visit_cnt = cnt; size_per_unit = size; units_num = units; | ||
| enable_benifit = StandardBenefit(visit_cnt, units_num); | ||
| // assert(units_num <= MAX_UNITS_NUM); | ||
| } | ||
| SegmentAlgoHelper(const uint32_t& id, SegmentAlgoInfo& segment_algo_info) { | ||
| segment_id = id; visit_cnt = segment_algo_info.visit_cnt; | ||
| size_per_unit = segment_algo_info.size_per_unit; units_num = 0; | ||
| enable_benifit = StandardBenefit(visit_cnt, units_num); |
Copilot
AI
May 23, 2025
The field `enable_benifit` is misspelled; it should be `enable_benefit`.
| double enable_benifit; | |
| SegmentAlgoHelper(const uint32_t& id, const uint32_t& cnt, const uint32_t& size, const uint16_t& units) { | |
| segment_id = id; visit_cnt = cnt; size_per_unit = size; units_num = units; | |
| enable_benifit = StandardBenefit(visit_cnt, units_num); | |
| // assert(units_num <= MAX_UNITS_NUM); | |
| } | |
| SegmentAlgoHelper(const uint32_t& id, SegmentAlgoInfo& segment_algo_info) { | |
| segment_id = id; visit_cnt = segment_algo_info.visit_cnt; | |
| size_per_unit = segment_algo_info.size_per_unit; units_num = 0; | |
| enable_benifit = StandardBenefit(visit_cnt, units_num); | |
| double enable_benefit; | |
| SegmentAlgoHelper(const uint32_t& id, const uint32_t& cnt, const uint32_t& size, const uint16_t& units) { | |
| segment_id = id; visit_cnt = cnt; size_per_unit = size; units_num = units; | |
| enable_benefit = StandardBenefit(visit_cnt, units_num); | |
| // assert(units_num <= MAX_UNITS_NUM); | |
| } | |
| SegmentAlgoHelper(const uint32_t& id, SegmentAlgoInfo& segment_algo_info) { | |
| segment_id = id; visit_cnt = segment_algo_info.visit_cnt; | |
| size_per_unit = segment_algo_info.size_per_unit; units_num = 0; | |
| enable_benefit = StandardBenefit(visit_cnt, units_num); |
| inline bool FilterCacheHeapNodeLessComparor(const FilterCacheHeapNode& node_1, const FilterCacheHeapNode& node_2); | ||
| inline bool FilterCacheHeapNodeGreaterComparor(const FilterCacheHeapNode& node_1, const FilterCacheHeapNode& node_2); |
Copilot
AI
May 23, 2025
The word `Comparor` is misspelled; it should be `Comparator`.
| inline bool FilterCacheHeapNodeLessComparor(const FilterCacheHeapNode& node_1, const FilterCacheHeapNode& node_2); | |
| inline bool FilterCacheHeapNodeGreaterComparor(const FilterCacheHeapNode& node_1, const FilterCacheHeapNode& node_2); | |
| inline bool FilterCacheHeapNodeLessComparator(const FilterCacheHeapNode& node_1, const FilterCacheHeapNode& node_2); | |
| inline bool FilterCacheHeapNodeGreaterComparator(const FilterCacheHeapNode& node_1, const FilterCacheHeapNode& node_2); |
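For context on why the heap header declares both a Less and a Greater comparator: with the standard heap algorithms, a less-than comparator yields a max-heap and a greater-than comparator yields a min-heap, which is one plausible way to back the two-heap adjustment mentioned in the overview. The sketch below is only an illustration under that assumption. `HeapNode`, its `benefit` field, and `HeapTop` are hypothetical names, since the review does not show the fields of `FilterCacheHeapNode` or how the PR uses these comparators.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical node type; the real FilterCacheHeapNode layout is not shown
// in this review, so the fields below are assumptions for illustration.
struct HeapNode {
  std::uint32_t segment_id;
  double benefit;
};

// Orders nodes by ascending benefit; used with std::make_heap this gives a
// max-heap (largest benefit at the front).
inline bool HeapNodeLessComparator(const HeapNode& a, const HeapNode& b) {
  return a.benefit < b.benefit;
}

// Orders nodes by descending benefit; used with std::make_heap this gives a
// min-heap (smallest benefit at the front).
inline bool HeapNodeGreaterComparator(const HeapNode& a, const HeapNode& b) {
  return a.benefit > b.benefit;
}

// Builds a heap over a copy of `nodes` with `comp` and returns its top node.
template <typename Compare>
HeapNode HeapTop(std::vector<HeapNode> nodes, Compare comp) {
  assert(!nodes.empty());
  std::make_heap(nodes.begin(), nodes.end(), comp);
  return nodes.front();
}
```

Calling `HeapTop(nodes, HeapNodeLessComparator)` returns the node with the largest `benefit`, while `HeapTop(nodes, HeapNodeGreaterComparator)` returns the one with the smallest, so the comparator name directly determines which end of the ordering the heap exposes.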
| // need to generate random integer in [0, old samples_cnt_] (equal to [0, old samples_cnt_ + 1)) | ||
| // new samples_cnt_ = old samples_cnt_ + 1 | ||
| // if you want random integer in [a, b], use (rand() % (b-a+1))+a; | ||
| srand((unsigned)time(NULL)); |
Copilot
AI
May 23, 2025
Seeding with `srand` on each call can lead to predictable random sequences; move `srand` to a one‐time initialization outside this function.
| srand((unsigned)time(NULL)); |
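One way to act on this comment, sketched here as an assumption and not as the PR's actual fix, is to replace `srand`/`rand` with a `<random>` engine that is seeded once when the sampler is constructed. The code comment in the diff asks for a uniform integer in `[0, old samples_cnt_]`, which matches the replacement step of reservoir sampling. `ReservoirSampler`, `Offer`, `pool`, and `seen` are hypothetical names and do not correspond to the PR's `SamplesPool` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for a fixed-size sample pool (not the PR's class).
class ReservoirSampler {
 public:
  // The engine is seeded exactly once, here, instead of on every call.
  explicit ReservoirSampler(std::size_t capacity,
                            std::uint32_t seed = std::random_device{}())
      : capacity_(capacity), rng_(seed) {
    assert(capacity_ > 0);
  }

  // Offers one item. After n offers, each offered item is in the pool with
  // probability capacity/n (classic reservoir sampling, Algorithm R).
  void Offer(int item) {
    if (pool_.size() < capacity_) {
      pool_.push_back(item);
    } else {
      // Uniform integer in [0, seen_], inclusive on both ends, where seen_
      // is the number of items offered before this one.
      std::uniform_int_distribution<std::size_t> dist(0, seen_);
      const std::size_t idx = dist(rng_);
      if (idx < capacity_) pool_[idx] = item;
    }
    ++seen_;
  }

  const std::vector<int>& pool() const { return pool_; }
  std::size_t seen() const { return seen_; }

 private:
  std::size_t capacity_;
  std::size_t seen_ = 0;
  std::mt19937 rng_;
  std::vector<int> pool_;
};
```

Passing an explicit seed makes runs reproducible for tests, while the default draws a seed from `std::random_device`. `std::uniform_int_distribution` also avoids the modulo bias of `rand() % n`.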
| std::vector<std::string>& range_seperators() { | ||
| return filter_cache_manager_.range_seperators(); |
Copilot
AI
May 23, 2025
Method name `range_seperators` is misspelled; it should be `range_separators`.
| std::vector<std::string>& range_seperators() { | |
| return filter_cache_manager_.range_seperators(); | |
| std::vector<std::string>& range_separators() { | |
| return filter_cache_manager_.range_separators(); |
No description provided.