-
Notifications
You must be signed in to change notification settings - Fork 4
Binary KVReplayGenerator #101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
43d5ebc
to
40bc738
Compare
: ReplayGeneratorBase(config), binaryStream_(config) { | ||
for (uint32_t i = 0; i < numShards_; ++i) { | ||
stressorCtxs_.emplace_back(std::make_unique<StressorCtx>(i)); | ||
std::string_view s{"abc"}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's this for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
outputStreamKeys_ += nreqs * binReqSize; | ||
folly::setThreadName("cb_binary_gen"); | ||
traceStream_.fastForwardTrace(fastForwardCount_); | ||
genRequests(latch); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you need to pass the latch here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In generating the binary file I don't need the latch, but in the other path (reading the CSV file normally) we use the latch to wait for preloading some requests. So it's not used in binary trace generation but I still pass it because it's a parameter to genRequests.
Is there a better way to handle this passing the latch in this case? genRequests does require it (counts down once we have hit a number of requests).
req->req_.ttlSecs, keyOffset); | ||
std::memcpy(outputStreamReqs_ + nreqs, &binReq, sizeof(binReq)); | ||
std::memcpy(outputStreamKeys_ + keyOffset, req->key_.c_str(), keySize); | ||
if ((nreqs % BIN_REQ_INT) == 0 && nreqs > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment for this part of the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
XDCHECK_LT(req->op_, 12); | ||
auto key = req->getKey(binaryStream_.getKeyOffset()); | ||
OpType op; | ||
switch (req->op_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can just cast OptType to uint8_t here and avoid the switch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
setEOF(); | ||
} | ||
|
||
const Request& BinaryKVReplayGenerator::getReq(uint8_t, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function (and the entire file) looks pretty similar to KVReplayGenerator.h Can't we unify them? Perhaps make this (and other) function a template (where either BinaryRequest* or std::unique_ptr is the arg)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's actually pretty hard because we don't have the same underlying request queue logic in the binary trace version.
cachelib/cachebench/util/Request.h
Outdated
ttl_(ttl), | ||
keyOffset_(keyOffset) {} | ||
|
||
std::string_view getKey(char* baseAddr) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you will store offset relative to this
pointer than you do not need to pass baseAddr
as a parameter. The code of the method will look something like this:
std::string_view getKey() const {
return std::string_view(
reinterpret_cast<const char*>(this+ keyOffset_), keySize_);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just figured out that you store keys in separate file, correct? if so, my idea is not feasible. It is only possible if BinaryRequest
objects and corresponding keys are stored in the same file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done - thanks for the note!
------------------------ This offers much lower overhead of trace replaying. It assumes the kvcache trace format and kvcache behavoir. This patch supports the following: - binary request generation and replay - fast forwarding of a trace - preloading requests into memory - object size amplification - queue free for even lower request overhead - can parse many more requests per second than cachelib can process, so we can get 100% CPU usage The limitations are: - no trace amplification (however you can amplify the original .csv trace and save it in binary format) - ~4GB overhead per 100 million requests - you need some disk space to store large traces
cf0a7c1
to
9d49e46
Compare
This change is