Background data movement #20
Conversation
Reviewed 5 of 24 files at r1.
Reviewable status: 5 of 24 files reviewed, 3 unresolved discussions (waiting on @byrnedj and @vinser52)
cachelib/allocator/BackgroundMover.h
line 68 at r1 (raw file):
~BackgroundMover() override;
You could try running clang-format on these new files - it should get rid of some of the unnecessary whitespace characters.
cachelib/allocator/BackgroundMover-inl.h
line 87 at r1 (raw file):
}
totalBytesMoved.add(batch * mpStats.acStats.at(cid).allocSize);
Shouldn't we use `moved` instead of `batch` for this calculation?
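A minimal sketch of the change being suggested, assuming `moved` holds the number of items actually moved by the per-class call earlier in this loop, while `batch` is only the requested batch size:

// Sketch of the suggested accounting fix: count the items actually moved,
// not the requested batch size.
totalBytesMoved.add(moved * mpStats.acStats.at(cid).allocSize);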
cachelib/allocator/CacheAllocatorConfig.h
line 629 at r1 (raw file):
double minAcAllocationWatermark{0.0};
The rest of the options (sizeThresholdPolicy, etc.) will be in a separate PR, right? I think you might want to remove the mention of sizeThresholdPolicy from the .md file for now.
perhaps we could just extend the test a little bit
Reviewable status: 5 of 24 files reviewed, 3 unresolved discussions (waiting on @byrnedj and @vinser52)
cachelib/allocator/tests/AllocatorMemoryTiersTest.h
line 112 at r1 (raw file):
auto perclassPstats = allocator->getBackgroundMoverClassStats(MoverDir::Promote);
EXPECT_GT(1, stats.evictionStats.numMovedItems);
Can we add a more restrictive test? Since we know the values of the watermarks, we could test whether the specified percentage of memory is free in the first tier after the BG workers do their work.
Reviewed all commit messages.
Reviewable status: 5 of 24 files reviewed, 7 unresolved discussions (waiting on @byrnedj, @igchor, and @vinser52)
MultiTierDataMovement.md
line 15 at r1 (raw file):
## Synchronous Eviction and Promotion
- `disableEvictionToMemory`: Disables eviction to memory (item is always evicted to NVMe or removed
Do I understand correctly that tiering would not work in that case? Or are we going to use a first-free policy for allocations?
MultiTierDataMovement.md
line 26 at r1 (raw file):
- `backgroundEvictorIntervalMilSec`: The interval that this thread runs for - by default the background evictor threads will wake up every 10 ms to scan the AllocationClasses. Also, the background evictor thead will be woken up everytime there is a failed allocation (from
Misprint: thead -> thread
MultiTierDataMovement.md
line 33 at r1 (raw file):
- `evictorThreads`: The number of background evictors to run - each thread is a assigned a set of AllocationClasses to scan and evict objects from. Currently, each thread gets an equal number of classes to scan - but as object size distribution may be unequal - future
I am not sure we should write about the future. We can change this description when balancing is implemented.
MultiTierDataMovement.md
line 62 at r1 (raw file):
## Background Promoters
The background promotes scan each class to see if there are objects to move to a lower
Misprint? promotes -> promoters
Yes. Tiering does not work if this option is on - we could implement first free or some other allocation policy if this is disabled - see #10. I removed documentation for admission policies from this version and will add it to #10.
The implementation is complete - I can add it to this PR or wait until this one has been merged to develop.
Done. And reformatted the code.
Reviewed 1 of 24 files at r1, 1 of 10 files at r2, 1 of 3 files at r3.
Reviewable status: 4 of 24 files reviewed, 5 unresolved discussions (waiting on @byrnedj, @igchor, and @vinser52)
cachelib/allocator/tests/AllocatorMemoryTiersTest.h
line 100 at r3 (raw file):
// wait for bg movers
std::this_thread::sleep_for(std::chrono::seconds(1));
Is there a way to deterministically wait for BG threads completion? Sleeps like this tend to be hard to debug and cause random failures.
Maybe we could just spin until numMovedItems (or freePercent) is what we expect it to be, and then do all the checks?
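A minimal sketch of the kind of deterministic wait being suggested, using a generic polling helper; the accessor in the usage comment is a placeholder, not the PR's actual API:

#include <chrono>
#include <functional>
#include <thread>

// Poll a predicate until it holds or a timeout expires, instead of sleeping
// for a fixed amount of time. Returns whether the predicate was satisfied.
inline bool waitFor(const std::function<bool()>& pred,
                    std::chrono::milliseconds timeout = std::chrono::seconds(10),
                    std::chrono::milliseconds step = std::chrono::milliseconds(10)) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (std::chrono::steady_clock::now() < deadline) {
    if (pred()) {
      return true;
    }
    std::this_thread::sleep_for(step);
  }
  return pred();
}

// Possible usage in the test (getNumMovedItems is a hypothetical accessor):
// ASSERT_TRUE(waitFor([&] { return getNumMovedItems(*allocator) > 0; }));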
cachelib/cachebench/util/CacheConfig.cpp
line 100 at r3 (raw file):
if (configJson.count("memoryTiers")) {
  for (auto& it : configJson["memoryTiers"]) {
    memoryTierConfigs.push_back(MemoryTierConfig(it).getMemoryTierCacheConfig());
where is this code now?
Done.
Fixed.
Reviewed 11 of 24 files at r1, 5 of 10 files at r2, 1 of 3 files at r3, 5 of 8 files at r4, all commit messages.
Reviewable status: 24 of 28 files reviewed, 6 unresolved discussions (waiting on @byrnedj, @igchor, and @vinser52)
cachelib/allocator/BackgroundMover.h
line 20 at r4 (raw file):
#include <folly/concurrency/UnboundedQueue.h>
#include <gtest/gtest_prod.h>
why is this needed?
cachelib/allocator/MMTinyLFU-inl.h
line 235 at r4 (raw file):
void MMTinyLFU::Container<T, HookPtr>::withPromotionIterator(F&& fun) {
  throw std::runtime_error("Not supported");
Could you maybe open an issue to implement this in the future?
cachelib/external/fbthrift
line 1 at r4 (raw file):
Subproject commit 74f3a8fb00e3963d1e6f2361fce722688d500417
Can you revert updating the dependencies? It will cause conflicts later.
It's not - it was leftover from development. I have removed it.
Sure, done.
Yes, sorry about that. I have reverted them.
There is just one test failing on the CI (not sure if it's random or not).
Reviewable status: 22 of 28 files reviewed, 4 unresolved discussions (waiting on @byrnedj, @igchor, and @vinser52)
When I run the tests locally, I do not see any failures.
Reviewed 11 of 24 files at r1, 4 of 10 files at r2, 1 of 3 files at r3, 2 of 8 files at r4, 1 of 2 files at r5, 3 of 3 files at r6.
Reviewable status: 27 of 28 files reviewed, 2 unresolved discussions (waiting on @byrnedj and @igchor)
MultiTierDataMovement.md
line 15 at r1 (raw file):
Previously, vinser52 (Sergei Vinogradov) wrote…
Do I understand correctly that tiering would not work in that case? Or we are going to use first-free policy for allocations?
Should it be renamed to `disableMovement`? According to the current description, it looks like we are prohibiting item movement between tiers. As I remember, in the legacy implementation this option did not allow any item to be implicitly thrown away by eviction from the cache. Now, according to the comment, the item can be removed by eviction.
cachelib/allocator/BackgroundMover.h
line 97 at r6 (raw file):
AtomicCounter totalBytesMoved{0};
std::vector<std::tuple<TierId, PoolId, ClassId>> assignedMemory_;
Since we are using the std::tuple<TierId, PoolId, ClassId> definition in multiple places, should there be an alias for it, e.g. `using memory_descriptor_type = std::tuple<TierId, PoolId, ClassId>;`? In case we want to change this type in the future, it would then be defined in only a single place.
Probably, it is even better to have a separate class/struct instead of a tuple, but that is up to you.
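A minimal sketch of the alias (or small struct) being proposed, assuming the TierId, PoolId and ClassId types already visible in BackgroundMover.h; the names are illustrative only, not the ones ultimately used in the PR:

#include <tuple>
#include <vector>

// One alias keeps the descriptor type defined in a single place.
using MemoryDescriptorType = std::tuple<TierId, PoolId, ClassId>;

// The member (and every other use of the tuple) can then refer to the alias.
std::vector<MemoryDescriptorType> assignedMemory_;

// A small struct would also work and makes call sites self-documenting:
// struct MemoryDescriptor { TierId tid; PoolId pid; ClassId cid; };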
Reviewable status: 20 of 28 files reviewed, 2 unresolved discussions (waiting on @igchor and @vinser52)
MultiTierDataMovement.md
line 15 at r1 (raw file):
Previously, vinser52 (Sergei Vinogradov) wrote…
Should it be renamed to `disableMovement`? According to the current description, it looks like we are prohibiting item movement between tiers. As I remember, in the legacy implementation this option did not allow any item to be implicitly thrown away by eviction from the cache. Now, according to the comment, the item can be removed by eviction.
It appears that disableEviction will be deprecated. I have removed this from the documentation.
cachelib/allocator/BackgroundMover.h
line 97 at r6 (raw file):
Previously, vinser52 (Sergei Vinogradov) wrote…
Since we are using the std::tuple<TierId, PoolId, ClassId> definition in multiple places, should there be an alias for it, e.g. `using memory_descriptor_type = std::tuple<TierId, PoolId, ClassId>;`? In case we want to change this type in the future, it would then be defined in only a single place.
Probably, it is even better to have a separate class/struct instead of a tuple, but that is up to you.
Done.
Reviewed 8 of 8 files at r7.
Reviewable status: all files reviewed (commit messages unreviewed), 2 unresolved discussions (waiting on @vinser52)
Background data movement using periodic workers. Attempts to evict/promote items per given thresholds for each class. These reduce p99 latency since there is a higher chance that an allocation slot is free in the tier we are allocating in.
Use docker with prebuilt dependencies
Background data movement using periodic workers. Attempts to evict/promote items per given thresholds for each class. These reduce p99 latency since there is a higher chance that an allocation slot is free in the tier we are allocating in. Fix a race in promotion where releaseBackToAllocator was being called before wakeUpWaiters. Reinsert to the MM container on failed promotion.
This is the second part of: #2
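As a closing illustration of the per-class thresholds mentioned in the description above, here is a minimal standalone sketch of a watermark-driven eviction batch calculation; the names, default values and formula are assumptions for illustration, not the PR's actual implementation:

#include <algorithm>
#include <cstddef>

// Illustrative watermarks: start background eviction for a class when its
// free percentage drops below lowFreePercent, and size the batch so the
// class is brought back up toward highFreePercent.
struct EvictionWatermarks {
  double lowFreePercent{2.0};
  double highFreePercent{5.0};
};

size_t computeEvictionBatch(size_t totalSlots,
                            size_t freeSlots,
                            const EvictionWatermarks& wm,
                            size_t maxBatch) {
  if (totalSlots == 0) {
    return 0;
  }
  const double freePercent =
      100.0 * static_cast<double>(freeSlots) / static_cast<double>(totalSlots);
  if (freePercent >= wm.lowFreePercent) {
    return 0; // enough free slots, nothing for the background evictor to do
  }
  // Slots needed to reach the high watermark, capped by the batch limit.
  const auto targetFree = static_cast<size_t>(
      wm.highFreePercent / 100.0 * static_cast<double>(totalSlots));
  const size_t needed = targetFree > freeSlots ? targetFree - freeSlots : 0;
  return std::min(needed, maxBatch);
}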