-
Notifications
You must be signed in to change notification settings - Fork 129
fix(runner): fix http requests not returning 503 when actor removed #3322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
3 Skipped Deployments
|
Code ReviewSummaryThis PR correctly fixes HTTP requests not returning 503 when an actor is removed by changing the status code from 404 to 503 and adding the required Positive Feedback✅ Correct Implementation: The change properly implements the Guard retry protocol:
✅ Good Documentation: The inline comment with GitHub link provides excellent context for future maintainers ✅ Consistent Pattern: Matches the retry documentation in engine/contrib-docs/GUARD.md Issues & Recommendations1. Inconsistency with WebSocket Handling (Minor)The WebSocket path handles the same scenario differently at tunnel.ts:466-481:
Question: Should WebSockets also retry when actors are not found? If yes, consider setting 2. Missing Test Coverage (Moderate)This is a critical retry path but appears to lack test coverage:
Recommendation: Add a test case to verify: // Pseudo-code
test('returns 503 with retry header when actor not found', async () => {
const response = await tunnel.fetch(unknownActorId, request);
expect(response.status).toBe(503);
expect(response.headers.get('x-rivet-error')).toBe('runner.actor_not_found');
});3. Error Code Validation (Minor)The error code 4. Response Body Clarity (Minor)The response body "Actor not found" might be confusing since this is a retryable 503, not a final "not found" error. Consider a body like "Actor temporarily unavailable" to better reflect the transient nature. Security Considerations✅ No security concerns identified. The change appropriately handles missing actors without exposing internal state. Performance Considerations✅ No performance concerns. The change is a simple status code + header modification with negligible overhead. Code Quality✅ Follows TypeScript best practices Overall AssessmentAPPROVED with minor recommendations. The core fix is correct and well-implemented. The main suggestion is to add test coverage for this critical retry path to prevent future regressions. |
65f206c to
b372bb1
Compare
b62cf5d to
4fb3477
Compare
Code ReviewSummaryThis PR correctly fixes HTTP requests not returning 503 when an actor is removed by changing the status code from 404 to 503 and adding the required Positive Feedback✅ Correct Implementation: The change properly implements the Guard retry protocol:
✅ Good Documentation: The inline comment with GitHub link provides excellent context for future maintainers ✅ Consistent Pattern: Matches the retry mechanism used throughout the Guard proxy service Issues & Recommendations1. Inconsistency with WebSocket Handling (Minor)The WebSocket path handles the same scenario differently - it sends a close with Question: Should WebSockets also retry when actors are not found? If yes, consider setting 2. Missing Test Coverage (Moderate Priority)This is a critical retry path but appears to lack test coverage:
Recommendation: Add a test case to verify the retry behavior works end-to-end, especially since this is part of a critical error handling path. 3. Error Code Validation (Minor)The error code 4. Response Body Clarity (Minor)The response body "Actor not found" might be confusing since this is a retryable 503, not a final "not found" error. Consider a body like "Actor temporarily unavailable" to better reflect the transient nature of the error. Security & Performance✅ No security concerns identified Code Quality✅ Follows TypeScript best practices Overall AssessmentAPPROVED with recommendations. The core fix is correct and well-implemented. The main suggestion is to add test coverage for this critical retry path to prevent future regressions. |
4fb3477 to
f89391d
Compare
b372bb1 to
102e612
Compare
|
Deployment failed with the following error: Learn More: https://vercel.com/docs/limits#rate-limits |
Code ReviewSummaryThis PR fixes HTTP requests to return 503 with retry logic when an actor is not found, instead of returning a 404. The change aligns with Guard's retry mechanism that checks for Positive Observations✅ Correct Fix: The change properly implements the retry protocol expected by Guard. The
✅ Good Documentation: The inline comment with the GitHub link to the Guard retry logic is excellent for maintainability. ✅ Consistent Error Handling: Using Issues & Concerns1. Inconsistent WebSocket Behavior (Medium Priority)The WebSocket handler has a similar "actor not found" case at // Line 467-481: WebSocket handling
if (!actor) {
logger()?.warn({
msg: "ignoring websocket for unknown actor",
actorId: open.actorId,
});
// Send close immediately
this.#sendMessage(requestId, {
tag: "ToServerWebSocketClose",
val: {
code: 1011,
reason: "Actor not found",
retry: false, // Should this be true?
},
});
return;
}2. Error Code Registration (Low Priority)The error code 3. Missing Test Coverage (Medium Priority)No test appears to cover this specific scenario. Consider adding a test case to verify:
Minor Observations
RecommendationApprove with suggested follow-ups:
The core fix is correct and addresses the issue properly. The suggested improvements are for consistency and robustness but don't block this PR. |
f89391d to
cadd5cb
Compare
102e612 to
bd0a1b7
Compare
Code Review - PR #3322SummaryThis PR fixes a critical bug where HTTP requests were returning a 404 status code when an actor was not found, instead of a 503 that would trigger Guard's retry logic. The change correctly implements the retry protocol by returning 503 with the Analysis✅ Strengths
🔍 Observations & Recommendations1. WebSocket Inconsistency (Minor) The WebSocket handler at // Current WebSocket behavior
this.#sendMessage(requestId, {
tag: "ToServerWebSocketClose",
val: {
code: 1011,
reason: "Actor not found",
retry: false, // ← Should this match HTTP's retry behavior?
},
});If the different behavior is intentional, consider adding a comment explaining why HTTP retries but WebSockets don't. 2. Test Coverage Gap (Moderate Priority) This is a critical error path but appears to lack test coverage. Consider adding tests to verify:
3. Response Body Message (Minor) The response body "Actor not found" might be misleading since 503 indicates a temporary condition. Consider using "Actor temporarily unavailable" to better communicate that this is a retryable error rather than a permanent "not found" condition. Security & Performance✅ No security concerns: The change appropriately handles missing actors without exposing internal state Code Quality✅ Follows TypeScript conventions VerdictAPPROVED ✅ The core fix is correct and properly addresses the issue. The recommendations above are for future improvements and don't block this PR. This is a solid bug fix that will improve the reliability of actor request handling. |
PR Review: Fix HTTP requests returning 503 when actor removedSummaryThis PR changes the response code from Code Quality: ✅ ExcellentStrengths:
Best Practices: ✅ GoodObservations:
Potential Issues:
|
Merge activity
|
bd0a1b7 to
6f5d613
Compare
cadd5cb to
0f73c04
Compare
PR Review: Fix HTTP requests not returning 503 when actor removedSummaryThis PR changes the response status code from Code Quality & Best Practices ✅Strengths:
Observations:
Correctness & Logic ✅The fix correctly addresses the race condition where:
The implementation matches the retry contract in fn should_retry_request_inner(status: StatusCode, headers: &hyper::HeaderMap) -> bool {
status == StatusCode::SERVICE_UNAVAILABLE && headers.contains_key(X_RIVET_ERROR)
}Performance Considerations ✅Positive impact:
No concerns - this is exactly the behavior Guard's retry mechanism was designed for. Security Concerns ✅No security issues identified:
Test Coverage
|
PR Review: Fix HTTP requests not returning 503 when actor removedSummaryThis PR changes the response when an HTTP request arrives for an actor that no longer exists on the runner, from a 404 Not Found to a 503 Service Unavailable with the x-rivet-error: runner.actor_not_found header. This enables Guard's retry mechanism to automatically attempt the request against a fresh routing lookup. Positive Aspects
Concerns & Recommendations1. Missing Test CoverageThere are no tests covering this scenario. I'd recommend adding: Integration test in engine/packages/guard-core/tests/proxy.rs:
Runner SDK test (if applicable):
2. Error Code ConsistencyThe error code runner.actor_not_found follows a different naming convention than Guard's errors:
Question: Is this intentional to distinguish between "Guard can't find actor" vs "Runner doesn't have actor"? If so, this is good. Otherwise, consider consistency. 3. Retry ImplicationsCurrent behavior:
Edge case consideration:
This seems acceptable, but worth documenting the expected behavior when an actor is permanently destroyed vs temporarily unavailable. 4. WebSocket HandlingI notice WebSocket connections have their own close mechanism (tunnel.ts:203): Is there a similar scenario for WebSockets that should also return 503? Or does the WebSocket protocol handle this differently? (Not blocking, just curious about consistency) Code Quality AssessmentStyle: Follows TypeScript conventions RecommendationApprove with suggestions. The core change is correct and valuable. Primary recommendation is to add test coverage for this retry scenario to prevent regressions. Additional Questions for Author
Generated with Claude Code |

No description provided.