Implement streaming lz4 compression #1611
It builds fine with the local Makefile, but the travis-ci build fails.
@@ -0,0 +1,2495 @@
/*
The LICENSE needs to be updated.
The LICENSE has been updated.
This file needs to be wrapped in an additional namespace; otherwise, if the application also uses lz4, it will cause link conflicts.
src/brpc/policy/lz4_compress.cpp
Outdated
size_t ref_cnt = in.backing_block_num();
LZ4_stream_t* lz4_stream = LZ4_createStream();
butil::IOBuf block_buf;
std::vector<size_t> block_metas;
Don't use size_t; its size is not fixed across architectures. Use int32 or int16 here.
src/brpc/policy/lz4_compress.cpp
Outdated
    block_metas.emplace_back(src_block_size);
}
size_t nblocks = block_metas.size() / 2;
out->append(&nblocks, sizeof(size_t));
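This can't be written out directly like this; it needs to be converted to network byte order. Also, as above, size_t can't be used here.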
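To make the two points above concrete (a fixed-width type and network byte order), here is a minimal sketch. It uses `std::string` as a stand-in for `butil::IOBuf` so that it stays self-contained, and `AppendUint32BE` / `ReadUint32BE` are hypothetical helper names, not part of brpc or of this PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: append a 32-bit value in network (big-endian) byte
// order. The result does not depend on host endianness or sizeof(size_t).
inline void AppendUint32BE(std::string* out, uint32_t v) {
    const char bytes[4] = {
        static_cast<char>((v >> 24) & 0xFF),
        static_cast<char>((v >> 16) & 0xFF),
        static_cast<char>((v >> 8) & 0xFF),
        static_cast<char>(v & 0xFF),
    };
    out->append(bytes, sizeof(bytes));
}

// Hypothetical helper: read a big-endian 32-bit value starting at `pos`.
// Returns false if fewer than 4 bytes remain.
inline bool ReadUint32BE(const std::string& in, size_t pos, uint32_t* v) {
    if (pos > in.size() || in.size() - pos < 4) {
        return false;
    }
    const unsigned char* p =
        reinterpret_cast<const unsigned char*>(in.data()) + pos;
    *v = (static_cast<uint32_t>(p[0]) << 24) |
         (static_cast<uint32_t>(p[1]) << 16) |
         (static_cast<uint32_t>(p[2]) << 8) |
         static_cast<uint32_t>(p[3]);
    return true;
}
```

With something along these lines, the block count would be narrowed to `uint32_t` (after a range check) before being appended, instead of appending the raw bytes of a `size_t`.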
src/brpc/policy/lz4_compress.cpp
Outdated
    return false;
}
std::vector<size_t> block_metas(nblocks * 2, 0);
buf_iter.copy_and_forward(block_metas.data(), nblocks * 2 * sizeof(size_t));
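All of these need to be implemented as proper serialization. If compactness matters here, varint encoding can be used (protobuf should have a similar interface).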
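For reference, a rough sketch of what varint encoding of the block sizes could look like. This is a hand-rolled base-128 varint over `std::string`, with hypothetical names `AppendVarint32` / `ReadVarint32`; in the real code, protobuf's `CodedOutputStream::WriteVarint32` and `CodedInputStream::ReadVarint32` would probably be the more natural choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: append `v` as a base-128 varint, 7 payload bits per
// byte, least-significant group first, high bit set on all but the last byte.
inline void AppendVarint32(std::string* out, uint32_t v) {
    while (v >= 0x80) {
        out->push_back(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out->push_back(static_cast<char>(v));
}

// Hypothetical helper: decode one varint starting at `*pos` and advance
// `*pos` past the bytes consumed. Returns false on truncated input or when
// the value needs more than 5 bytes; `*pos` is unspecified in that case.
inline bool ReadVarint32(const std::string& in, size_t* pos, uint32_t* v) {
    uint32_t result = 0;
    for (int shift = 0; shift <= 28; shift += 7) {
        if (*pos >= in.size()) {
            return false;
        }
        const uint32_t byte = static_cast<unsigned char>(in[*pos]);
        ++*pos;
        result |= (byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            *v = result;
            return true;
        }
    }
    return false;
}
```

Small block sizes then take one or two bytes on the wire instead of a fixed `sizeof(size_t)`, and the encoding is the same on every architecture.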
    return false;
}
LZ4_streamDecode_t* lz4_stream_decode = LZ4_createStreamDecode();
char* in_scratch = new char[max_block];
Use DEFINE_SMALL_ARRAY; in most cases this shouldn't be very large.
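The idea behind that suggestion, as I understand it, is to keep small scratch buffers on the stack and only fall back to the heap for large ones, with cleanup handled automatically on every return path. The sketch below illustrates that pattern with a hypothetical `ScratchBuffer` class; it is not the `DEFINE_SMALL_ARRAY` macro itself and makes no claim about how butil implements it.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical illustration of a small-buffer scratch area: requests up to
// kInlineBytes use inline (stack) storage, larger ones allocate on the heap.
// The heap block, if any, is released when the object goes out of scope.
template <size_t kInlineBytes>
class ScratchBuffer {
    static_assert(kInlineBytes > 0, "inline capacity must be non-zero");

public:
    explicit ScratchBuffer(size_t size)
        : heap_(size > kInlineBytes ? new char[size] : nullptr),
          data_(heap_ ? heap_.get() : inline_),
          size_(size) {}

    ScratchBuffer(const ScratchBuffer&) = delete;
    ScratchBuffer& operator=(const ScratchBuffer&) = delete;

    char* data() { return data_; }
    size_t size() const { return size_; }
    bool on_heap() const { return heap_ != nullptr; }

private:
    char inline_[kInlineBytes];     // used when size <= kInlineBytes
    std::unique_ptr<char[]> heap_;  // used otherwise
    char* data_;
    size_t size_;
};
```

Compared with a bare `new char[max_block]`, this also removes the need to `delete[]` on each early `return false`.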
    LZ4_freeStreamDecode(lz4_stream_decode);
    return false;
}
out->append_user_data(out_buf, dst_block_size, [](void *d) {
Can't IOBuf as a zero-copy stream be used here? Does this have to rely on user_data?
src/brpc/global.cpp
Outdated
@@ -386,6 +387,11 @@ static void GlobalInitializeOrDieImpl() {
if (RegisterCompressHandler(COMPRESS_TYPE_SNAPPY, snappy_compress) != 0) {
    exit(1);
}
const CompressHandler lz4_compress =
    { Lz4Compress, Lz4Decompress, "lz4" };
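lz4 has several modes: frame, block, and stream (as I recall, block seems to give the best compression ratio). This should be named something like lz4s rather than plain lz4.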
butil::IOBufAsZeroCopyOutputStream wrapper(&serialized_pb);
if (res.SerializeToZeroCopyStream(&wrapper)) {
    return Lz4Compress(serialized_pb, buf);
}
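I suggest writing a document first to discuss the wire format for this compression; the current one doesn't seem to be the most efficient. Combined with a zero-copy stream, it should be possible to compress while serializing (at the very least, the wire format needs to leave room for such an implementation). As it stands, an extra copy of intermediate data is still constructed.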
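As one possible starting point for that discussion, here is a sketch of a framing that can be written incrementally: each block carries its own length prefix and the stream ends with a zero-length marker, so the writer never needs to know the total block count up front. Everything here is an assumption for illustration: `BlockFrameWriter` and `ReadBlock` are hypothetical names, `std::string` stands in for `butil::IOBuf`, and the payloads are treated as opaque bytes that a real implementation would obtain from the lz4 streaming API.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical framing: [u32 big-endian payload size][payload] per block,
// terminated by a block whose size field is 0.
class BlockFrameWriter {
public:
    explicit BlockFrameWriter(std::string* out) : out_(out) {}

    // Emit one already-compressed block as soon as it is ready.
    void WriteBlock(const std::string& payload) {
        if (payload.empty()) {
            return;  // size 0 is reserved for the end marker
        }
        AppendSize(static_cast<uint32_t>(payload.size()));
        out_->append(payload);
    }

    // Emit the end marker once the last block has been written.
    void Finish() { AppendSize(0); }

private:
    void AppendSize(uint32_t n) {
        for (int shift = 24; shift >= 0; shift -= 8) {
            out_->push_back(static_cast<char>((n >> shift) & 0xFF));
        }
    }

    std::string* out_;
};

// Reads the next block starting at `*pos` and advances `*pos`.
// Returns false at the end marker or on truncated input.
inline bool ReadBlock(const std::string& in, size_t* pos,
                      std::string* payload) {
    if (*pos > in.size() || in.size() - *pos < 4) {
        return false;
    }
    uint32_t n = 0;
    for (int i = 0; i < 4; ++i) {
        n = (n << 8) | static_cast<unsigned char>(in[*pos + i]);
    }
    *pos += 4;
    if (n == 0 || in.size() - *pos < n) {
        return false;
    }
    payload->assign(in, *pos, n);
    *pos += n;
    return true;
}
```

A format shaped like this would let a compressing `ZeroCopyOutputStream` flush each block as the protobuf serializer fills it, which a header containing the block count and all block sizes does not allow.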
Is there any plan to keep following up on this PR?
I'll optimize it further this week.
Is there any plan to keep following up on this PR?
| Compress method | Compress size (B) | Compress time (us) | Decompress time (us) | Compress throughput (MB/s) | Decompress throughput (MB/s) | Compress ratio |
|---|---|---|---|---|---|---|
| Snappy | 128 | 0.630420 | 0.676840 | 193.633312 | 180.353278 | 37.500000% |
| Gzip | 128 | 4.886340 | 0.804860 | 24.981952 | 151.666517 | 47.656250% |
| Zlib | 128 | 4.376520 | 0.665340 | 27.892095 | 183.470575 | 38.281250% |
| Lz4 | 128 | 0.614140 | 0.359200 | 198.766263 | 339.839400 | 58.115935% |
In this 128-byte benchmark, LZ4's compression and decompression throughput both exceed those of Snappy, Gzip, and Zlib.