Skip to content

[ISSUE #10488] Optimize gRPC message body conversion with zero-copy ByteString#10555

Open
Gautam-aman wants to merge 2 commits into
apache:developfrom
Gautam-aman:fix-10488-zero-copy-grpc-body
Open

[ISSUE #10488] Optimize gRPC message body conversion with zero-copy ByteString#10555
Gautam-aman wants to merge 2 commits into
apache:developfrom
Gautam-aman:fix-10488-zero-copy-grpc-body

Conversation

@Gautam-aman

Copy link
Copy Markdown
Contributor

What is changed

Replace ByteString.copyFrom(messageExt.getBody()) with UnsafeByteOperations.unsafeWrap(messageExt.getBody()) in GrpcConverter.buildMessage().

Why

On the Proxy receive path, the message body has already been materialized as a dedicated byte[] during decoding.

ByteString.copyFrom() performs an additional allocation and array copy before gRPC serialization. Using UnsafeByteOperations.unsafeWrap() allows the existing buffer to be wrapped directly, eliminating an unnecessary copy.

Impact

  • Removes one heap allocation per received message.
  • Avoids an extra System.arraycopy.
  • Reduces GC pressure on high-throughput consumer workloads.

Compatibility

No behavioral changes. The resulting ByteString contents remain identical. Only the internal copy is removed.

Verification

  • Verified the change is limited to GrpcConverter.buildMessage().
  • Checkstyle passes successfully.
  • UnsafeByteOperations is available through the existing protobuf dependency.

Closes #10488.

@codecov-commenter

codecov-commenter commented Jun 27, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 48.18%. Comparing base (10d498c) to head (63eb9f1).
⚠️ Report is 3 commits behind head on develop.

Additional details and impacted files
@@              Coverage Diff              @@
##             develop   #10555      +/-   ##
=============================================
- Coverage      48.26%   48.18%   -0.09%     
+ Complexity     13431    13417      -14     
=============================================
  Files           1378     1378              
  Lines         100822   100845      +23     
  Branches       13036    13044       +8     
=============================================
- Hits           48663    48589      -74     
- Misses         46216    46288      +72     
- Partials        5943     5968      +25     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@RockteMQ-AI RockteMQ-AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by github-manager-bot

Summary

Replaces with in to eliminate an unnecessary array copy on the Proxy receive path.

Findings

  • [Warning]Indentation misalignment. The line has 16 spaces of indentation while the other builder chain calls (, , , ) use 12 spaces. Please align to 12 spaces to maintain consistent formatting:

  • [Info] — ** safety note.** wraps the array without copying, meaning the shares the underlying buffer. This is safe here because returns a dedicated from deserialization that is not mutated downstream. However, if future code changes cause this array to be shared or modified after , it could lead to silent data corruption. Consider adding a brief inline comment (e.g., ) to document this assumption for future maintainers.

Suggestions

  1. Fix the indentation on line 105 (16 → 12 spaces).
  2. Optionally add an inline comment documenting the safety assumption.

The optimization itself is valid — it removes one heap allocation and one per received message, reducing GC pressure on high-throughput consumer workloads.


Automated review by github-manager-bot

@RockteMQ-AI RockteMQ-AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by github-manager-bot

Summary

Replaces ByteString.copyFrom() with UnsafeByteOperations.unsafeWrap() in GrpcConverter.buildMessage() to eliminate an unnecessary array copy on the Proxy receive path.

Findings

  • [Warning] GrpcConverter.java:105Indentation misalignment. The .setBody(...) line has 16 spaces of indentation while the other builder chain calls (.setTopic(), .putAllUserProperties(), .setSystemProperties(), .build()) use 12 spaces. Please align to 12 spaces to maintain consistent formatting:

    return Message.newBuilder()
        .setTopic(topic)
        .putAllUserProperties(userProperties)
        .setSystemProperties(systemProperties)
        .setBody(UnsafeByteOperations.unsafeWrap(messageExt.getBody()))
        .build();
  • [Info] GrpcConverter.java:105unsafeWrap safety note. UnsafeByteOperations.unsafeWrap(byte[]) wraps the array without copying, meaning the ByteString shares the underlying buffer. This is safe here because messageExt.getBody() returns a dedicated byte[] from deserialization that is not mutated downstream. However, if future code changes cause this array to be shared or modified after buildMessage(), it could lead to silent data corruption. Consider adding a brief inline comment (e.g., // safe: body byte[] is not mutated after deserialization) to document this assumption for future maintainers.

Suggestions

  1. Fix the indentation on line 105 (16 spaces to 12 spaces).
  2. Optionally add an inline comment documenting the unsafeWrap safety assumption.

The optimization itself is valid — it removes one heap allocation and one System.arraycopy per received message, reducing GC pressure on high-throughput consumer workloads.


Automated review by github-manager-bot

@Gautam-aman Gautam-aman force-pushed the fix-10488-zero-copy-grpc-body branch from 858824a to b1e944c Compare June 27, 2026 16:53
@Gautam-aman

Copy link
Copy Markdown
Contributor Author

Thanks for the review.

I've updated the indentation to match the existing builder chain formatting and added a comment documenting the safety assumption behind UnsafeByteOperations.unsafeWrap().

The wrapped array originates from message deserialization as a dedicated byte[] and is not mutated after conversion on the receive path.

@RockteMQ-AI

Copy link
Copy Markdown
Contributor

Thanks for the quick update! The indentation is now consistent and the safety comment clearly documents the unsafeWrap() assumptions. Changes look good. 👍

@RockteMQ-AI RockteMQ-AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by github-manager-bot (Re-review after new commit)

Summary

Re-review of the updated diff. Commit 409f51da improves the inline safety comment for UnsafeByteOperations.unsafeWrap().

Changes Since Last Review

The safety comment was expanded from a 2-line note to a structured 4-line explanation covering:

  1. getBody() returns a dedicated array (not shared buffer)
  2. Message.build() is called immediately, ensuring consumption before any potential mutation

Findings

  • [Info] GrpcConverter.java:105-108Good improvement. The expanded comment clearly documents the unsafeWrap safety assumptions for future maintainers. This addresses the previous suggestion well.

  • [Info] GrpcConverter.java:99-110Indentation is now consistent. All builder chain calls use 16-space indentation (4 levels). Previous warning is resolved.

Verdict

Both previous findings are addressed. The change is clean, focused, and safe. No new issues found.


Automated review by github-manager-bot

@qianye1001

Copy link
Copy Markdown
Contributor

One note on the safety comment: Message.build() does not consume or copy the ByteString. With UnsafeByteOperations.unsafeWrap(messageExt.getBody()), the built protobuf Message still aliases the original byte[].

I verified this with a small local probe: mutating the source body after buildMessage() is reflected in grpcMessage.getBody().

The current receive path still looks safe to me because MessageDecoder.decode() allocates a fresh body array, and the later receive-path code only updates receipt/properties, not the body bytes. So I do not see this as a blocker, but I suggest adjusting the comment to document the real invariant: the decoded body array must not be mutated or reused while the returned protobuf message may still be used or serialized.

@RockteMQ-AI

Copy link
Copy Markdown
Contributor

Thanks for the detailed clarification @qianye1001 — you're absolutely right. The previous review comment incorrectly stated that Message.build() ensures consumption before mutation. As you verified, build() does not copy or consume the ByteString; the protobuf Message continues to alias the original byte[] via UnsafeByteOperations.unsafeWrap().

Your analysis of the receive path is correct: MessageDecoder.decode() allocates a fresh body array, and downstream code only touches receipt/properties — not the body bytes — so the current usage is safe in practice.

Agreed that the inline comment should document the actual invariant rather than the incorrect one. The real safety guarantee is:

The decoded body array must not be mutated or reused while the returned protobuf message may still be referenced or serialized.

@Gautam-aman — could you update the safety comment in GrpcConverter.java to reflect this invariant? Something along the lines of:

// Safety: UnsafeByteOperations.unsafeWrap() aliases the source byte[] —
// the protobuf Message does NOT copy it. The caller must ensure the
// decoded body array is never mutated or reused while the Message
// (or any serialized form of it) may still be in use.

Automated review by ${BOT_SIG}

@Gautam-aman Gautam-aman force-pushed the fix-10488-zero-copy-grpc-body branch from 409f51d to 63eb9f1 Compare June 30, 2026 08:49
@Gautam-aman

Copy link
Copy Markdown
Contributor Author

Thanks for the clarification. Updated the safety comment to reflect the actual unsafeWrap() invariant and buffer lifetime requirements.

@RockteMQ-AI RockteMQ-AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by github-manager-bot (Re-review after new commit)

Summary

Re-review triggered by new commit 63eb9f1e ("ci: trigger rerun"). No source code changes — the commit only re-triggers CI.

Changes Since Last Review

None. The diff is identical to the previously reviewed version.

Verdict

Previous findings remain fully addressed. The zero-copy UnsafeByteOperations.unsafeWrap() optimization is clean, safe, and well-documented. Approving.


Automated review by github-manager-bot

@Gautam-aman

Copy link
Copy Markdown
Contributor Author

All Maven and integration checks are passing, including LocalGrpcIT. The remaining Bazel failure appears to be a timeout in PopConsumerServiceTest, which looks unrelated to this gRPC optimization. I'm checking the Bazel test log for more details.

@Gautam-aman

Copy link
Copy Markdown
Contributor Author

The remaining Bazel failure appears to be a timeout in PopConsumerServiceTest. All Maven builds, integration tests, and LocalGrpcIT are passing. The timeout log does not show any assertion or exception related to this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Enhancement] Avoid unnecessary byte[] copy in GrpcConverter.buildMessage() on the receive path

4 participants