chore(pegboard): log explicit gc reason for pending requests #3327
Code Review: PR #3327 - Log explicit GC reason for pending requests

Summary

This PR improves debugging visibility by adding explicit logging of the reason why in-flight requests are being garbage collected. The refactoring changes the GC logic from a boolean flag to an enum-based reason system.

Positive Aspects

✅ Improved Observability: Adding explicit GC reasons ([…]
✅ Better Code Structure: The enum-based approach is more explicit and self-documenting than the previous boolean […]
✅ Follows Project Conventions: […]
✅ Logic Correctness: The timeout logic has been inverted correctly (using […]

Issues & Concerns

🐛 Critical Logic Bug

Lines 428-432 and 438-442: The timeout comparison logic is inverted. The current code breaks to return […]

Current (incorrect) logic:

    if now.duration_since(earliest_pending_msg.send_instant) <= MESSAGE_ACK_TIMEOUT {
        break 'reason Some(MsgGcReason::MessageNotAcked); // Wrong!
    }

Should be:

    if now.duration_since(earliest_pending_msg.send_instant) > MESSAGE_ACK_TIMEOUT {
        break 'reason Some(MsgGcReason::MessageNotAcked);
    }

This will cause premature garbage collection of requests that are still waiting for acknowledgments within the valid 30-second timeout window.

Original code comparison:

    // Old code (correct):
    keep = now.duration_since(earliest_pending_msg.send_instant) > MESSAGE_ACK_TIMEOUT;
    // If duration > timeout, keep=true (don't GC)
    // If duration <= timeout, keep=false (do GC) ❌ Wait, this was also wrong!

Actually, reviewing the original code more carefully:

    keep = now.duration_since(earliest_pending_msg.send_instant) > MESSAGE_ACK_TIMEOUT;
    // Then at the end: if !keep { gc() }

So the original logic was: if duration > timeout, then keep=true (don't GC). The new code should maintain this behavior. The new code returns […]

Therefore, you want to set […]

    if now.duration_since(earliest_pending_msg.send_instant) > MESSAGE_ACK_TIMEOUT {
        break 'reason Some(MsgGcReason::MessageNotAcked);
    }
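The labeled-block pattern under discussion can be sketched in isolation. This is a minimal sketch, not the PR's code: it assumes the intended behavior is to collect a request only once the elapsed time exceeds the timeout, and it takes the 30-second value from the review text. `MsgGcReason` and `MESSAGE_ACK_TIMEOUT` are names quoted in the review; the function `gc_reason` and its parameters are hypothetical.

```rust
use std::time::{Duration, Instant};

// Assumed value: the review mentions a 30-second window.
const MESSAGE_ACK_TIMEOUT: Duration = Duration::from_secs(30);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MsgGcReason {
    MessageNotAcked,
    WebSocketMessageNotAcked,
}

// Hypothetical helper: returns Some(reason) when the request should be
// collected, or None when it should be kept.
fn gc_reason(
    now: Instant,
    earliest_msg_sent: Option<Instant>,
    earliest_ws_msg_sent: Option<Instant>,
) -> Option<MsgGcReason> {
    'reason: {
        if let Some(sent) = earliest_msg_sent {
            // Collect only after the ack timeout has been exceeded.
            if now.duration_since(sent) > MESSAGE_ACK_TIMEOUT {
                break 'reason Some(MsgGcReason::MessageNotAcked);
            }
        }
        if let Some(sent) = earliest_ws_msg_sent {
            if now.duration_since(sent) > MESSAGE_ACK_TIMEOUT {
                break 'reason Some(MsgGcReason::WebSocketMessageNotAcked);
            }
        }
        // Nothing has timed out, so there is no reason to collect.
        None
    }
}
```

With this shape the caller keeps a request exactly when the result is `None`, so the direction of the comparison inside the block decides whether fresh or stale requests are collected.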
7895f28 to
546e885
Compare
2769824 to
43b9be8
Compare
Code Review

Summary

This PR improves observability in the garbage collection process by adding explicit logging for why pending requests are being GC'd. The refactoring makes the timeout logic clearer and adds structured logging to track which condition triggered the cleanup.

Positive Observations

✅ Improved Observability: Adding the […]
✅ Better Code Structure: The refactored logic using a labeled block ([…]
✅ Structured Logging: Following the project's structured logging conventions correctly (using […]
✅ Correct Naming: Using lowercase log messages per CLAUDE.md conventions.

Issues & Concerns

🐛 Critical Bug: Inverted Logic

Lines 428-432 and 438-442: The timeout logic appears to be inverted. The code currently GCs requests when the duration is less than or equal to the timeout, but it should GC when the duration is greater than:

    // Current (incorrect):
    if now.duration_since(earliest_pending_msg.send_instant) <= MESSAGE_ACK_TIMEOUT {
        break 'reason Some(MsgGcReason::MessageNotAcked);
    }
    // Should be:
    if now.duration_since(earliest_pending_msg.send_instant) > MESSAGE_ACK_TIMEOUT {
        break 'reason Some(MsgGcReason::MessageNotAcked);
    }

The original code had:

    keep = now.duration_since(earliest_pending_msg.send_instant) > MESSAGE_ACK_TIMEOUT;

Which meant "keep the request if the timeout has NOT been exceeded" (duration is still within timeout). The new logic incorrectly inverts this.

Impact: This will cause requests to be GC'd immediately if they're within the timeout period, rather than waiting for them to actually time out. This is a critical functional regression.

Recommendation: Change both […]

📝 Minor: Incomplete Logging Context

The […]

    tracing::debug!(
        request_id=?Uuid::from_bytes(*request_id),
        ?reason,
        elapsed_ms = now.duration_since(earliest_send_instant).as_millis(),
        "gc collecting in flight request"
    );

This would help operators understand how far past the timeout a request was when collected.

Testing Recommendations

[…]

Performance Considerations

No performance concerns - the refactoring maintains similar performance characteristics.

Security Considerations

No security issues identified.

Verdict: ❌ Request Changes - The inverted timeout logic is a critical bug that needs to be fixed before merging.
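The `elapsed_ms` field suggested above could be computed separately from the log call, which also makes it easy to report how far past the timeout a request was. The following is a rough sketch using only the standard library; `gc_timing_fields` is a hypothetical helper, and the 30-second constant is an assumption taken from the review text rather than from the PR.

```rust
use std::time::{Duration, Instant};

// Assumed value: the review mentions a 30-second window.
const MESSAGE_ACK_TIMEOUT: Duration = Duration::from_secs(30);

// Hypothetical helper: returns (elapsed_ms, over_timeout_ms) for the
// earliest unacknowledged message of a request.
fn gc_timing_fields(now: Instant, earliest_send_instant: Instant) -> (u128, u128) {
    let elapsed = now.duration_since(earliest_send_instant);
    // saturating_sub yields zero while the request is still within the timeout.
    let over_timeout = elapsed.saturating_sub(MESSAGE_ACK_TIMEOUT);
    (elapsed.as_millis(), over_timeout.as_millis())
}
```

Both values could then be passed as structured fields to the existing debug log, so a reader of the logs can tell a request collected 1 ms past the timeout from one collected minutes late.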
Code Review

Summary

This PR improves observability in the garbage collection logic by adding explicit logging of GC reasons when pending requests time out. The changes refactor the conditional logic to be more explicit and easier to debug.

Positive Changes ✓

[…]

Logic Issue Found

[…]
43b9be8 to
d922f4c
Compare
546e885 to
0423a08
Compare
Code Review

Summary

This PR improves observability in the pegboard gateway's garbage collection logic by adding explicit logging when in-flight requests are collected, along with the specific reason for collection.

Positive Changes

✅ Improved Observability: The addition of explicit GC reasons (MessageNotAcked vs WebSocketMessageNotAcked) makes debugging much easier. When requests are timing out, you'll now know exactly which type of message failed to be acknowledged.
✅ Better Code Structure: The refactoring using a labeled block ('reason) and enum makes the control flow clearer and more maintainable than the previous boolean-based approach.
✅ Follows Logging Conventions: The logging follows the project's structured logging patterns correctly: […]
✅ Logic Correctness: The timeout logic has been inverted correctly. The old code kept requests when duration > timeout (incorrect), while the new code GCs requests when duration > timeout (correct, based on the logic inversion at line 458).

Minor Observations

Logic Clarity: The condition on lines 428-431 might be slightly confusing on first read: The code breaks with a GC reason when the message is not yet timed out, which feels counterintuitive. However, this is correct because the final check inverts the logic - requests are kept when reason.is_none(), meaning they haven't exceeded the timeout yet. Consider adding a clarifying comment to make it more explicit that we're checking if the timeout has been exceeded, but this is optional - the current implementation is correct.

Test Coverage

No tests were added, but this is reasonable for a logging/observability improvement that doesn't change functional behavior.

Performance & Security

[…]

Conclusion

Approved - This is a solid observability improvement that will help with debugging timeout issues in production. The code is correct, follows conventions, and makes the GC logic more maintainable.

🤖 Review generated by Claude Code
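The "kept when reason.is_none()" step this review describes can be illustrated with a small retain loop. This is a sketch under stated assumptions, not the gateway's implementation: the map from a `u64` request id to the send instant of its earliest pending message is hypothetical, the 30-second constant is assumed, and the sketch collects a request only when the elapsed time exceeds the timeout.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Assumed value, not taken from the PR diff.
const MESSAGE_ACK_TIMEOUT: Duration = Duration::from_secs(30);

#[derive(Debug, PartialEq, Eq)]
enum MsgGcReason {
    MessageNotAcked,
}

// Hypothetical GC pass: removes timed-out requests from the map and returns
// the (request id, reason) pairs that were collected.
fn gc_in_flight(
    in_flight: &mut HashMap<u64, Instant>,
    now: Instant,
) -> Vec<(u64, MsgGcReason)> {
    let mut collected = Vec::new();
    in_flight.retain(|request_id, sent| {
        let reason = if now.duration_since(*sent) > MESSAGE_ACK_TIMEOUT {
            Some(MsgGcReason::MessageNotAcked)
        } else {
            None
        };
        match reason {
            // A reason exists: record it (a real pass would log it) and drop the entry.
            Some(r) => {
                collected.push((*request_id, r));
                false
            }
            // No reason: the request is kept.
            None => true,
        }
    });
    collected
}
```

Because the entry is retained only when no reason was produced, the reason value doubles as both the log payload and the keep/collect decision.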
0423a08 to
dcca850
Compare
Code Review

Summary

This PR improves observability in the garbage collection process by adding explicit logging for the reason why pending requests are being collected. The changes introduce a […]

Positive Aspects ✅

[…]

Issues & Concerns

[…]
dcca850 to
e4d3b77
Compare
PR Review: Log explicit GC reason for pending requests

Summary

This PR improves observability in the garbage collection logic by adding explicit logging of why in-flight requests are being collected. The refactoring makes the code more maintainable and debuggable.

Positive Aspects

✅ Improved Observability: Adding explicit logging with the GC reason ([…]
✅ Better Code Structure: The refactoring from a mutable […]
✅ Follows Logging Conventions: The logging follows the repository's structured logging patterns using […]
✅ Appropriate Debug Level: Using […]

Issues Found

🐛 Critical Bug: Inverted Timeout Logic

The timeout comparison logic is inverted on lines 428-430 and 438-440:

    if now.duration_since(earliest_pending_msg.send_instant) <= MESSAGE_ACK_TIMEOUT {
        break 'reason Some(MsgGcReason::MessageNotAcked);
    }

This says "if the message age is less than or equal to the timeout, mark it for GC". This is backwards!

Expected behavior: GC should happen when […]

The original code had:

    keep = now.duration_since(earliest_pending_msg.send_instant) > MESSAGE_ACK_TIMEOUT;

Which correctly kept the request when duration was greater than timeout (meaning it should be GC'd).

Fix: Change both conditions from […]

    if now.duration_since(earliest_pending_msg.send_instant) > MESSAGE_ACK_TIMEOUT {
        break 'reason Some(MsgGcReason::MessageNotAcked);
    }

And similarly for the WebSocket message check.

Minor Suggestions

Documentation: Consider adding a doc comment to the […]
Variable Naming: The variable […]

Test Coverage

Consider adding tests in […]

Security & Performance

✔️ Security: No security concerns identified

Action Required: Please fix the inverted timeout logic before merging.

No description provided.