DAOS-15847 object: restore iov_len for fetch on dup-only (no-merge) SGLs #18413
In obj_dup_sgls_free(), when ctx->alloc_bitmaps[i] == NULL, the function skips
the entire SGL with `continue`:
    if (!ctx->alloc_bitmaps || !ctx->alloc_bitmaps[i])
            continue;
This case arises when an SGL was duplicated only to strip iov_buf_len==0 entries
(skip_sgl_iov returned true) but no IOVs were merged into allocated buffers.
During construction (second pass of obj_sgls_dup), non-merged IOVs are copied as:
    *iov_dup = *iov;
So iov_dup->iov_buf == iov->iov_buf (same pointer). The lower layer writes
fetch data into iov_dup->iov_buf and updates iov_dup->iov_len to reflect actual
bytes read.
Because obj_dup_sgls_free skips such SGLs, iov->iov_len in the *original* SGL
is never updated. After api_args->sgls is restored to orr_usgls, the caller sees
stale pre-fetch iov_len values even though the data is in the correct buffers.
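
The copy-back described above can be sketched as follows. This is a minimal
illustration, not the actual patch: toy_iov, toy_sgl and toy_restore_iov_len
are hypothetical stand-ins, and it assumes the dup keeps the surviving IOVs in
their original order, with only the iov_buf_len==0 entries removed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the IOV and SGL types; not the DAOS definitions. */
struct toy_iov {
	void   *iov_buf;     /* data buffer */
	size_t  iov_buf_len; /* buffer capacity */
	size_t  iov_len;     /* valid bytes in the buffer */
};

struct toy_sgl {
	unsigned int    sg_nr;   /* number of entries in sg_iovs */
	struct toy_iov *sg_iovs;
};

/*
 * Hypothetical helper: for a dup-only SGL (zero-capacity entries stripped,
 * nothing merged), copy the post-fetch iov_len back to the original IOVs.
 */
static void
toy_restore_iov_len(struct toy_sgl *orig, const struct toy_sgl *dup)
{
	unsigned int i, j = 0;

	for (i = 0; i < orig->sg_nr && j < dup->sg_nr; i++) {
		struct toy_iov *iov = &orig->sg_iovs[i];

		if (iov->iov_buf_len == 0)
			continue; /* stripped from the dup; no counterpart */

		/* Non-merged entries were copied by value, so buffers are shared. */
		assert(dup->sg_iovs[j].iov_buf == iov->iov_buf);
		iov->iov_len = dup->sg_iovs[j].iov_len;
		j++;
	}
}
```

Only iov_len needs to be copied here, since the data already sits in the
caller's buffers through the shared iov_buf pointers.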
Signed-off-by: Xuezhao Liu <xuezhao.liu@hpe.com>
Ticket title is 'refine rebuild's error handling'
Test stage NLT completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-18413/1/testReport/
Test stage Functional on EL 9 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-18413/1/execution/node/982/log
Test stage Functional Hardware Medium MD on SSD completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-18413/2/testReport/
Only one known failure, in dfuse/daos_build.py, which is not related to this PR.
@liuxuezhao Will you please merge latest master, since CI should be clean now? I will watch this PR and merge it once it makes it through CI cleanly.
sure, merged |