UN-1824 [FIX] Return HTTP 409 when tool image not found in container registry #1757
…calls inside the try block in runner.py
for more information, see https://pre-commit.ci
Summary by CodeRabbit
Release Notes
Walkthrough
Add end-to-end detection and propagation of "tool image not found in registry" across API, runner, tool-sandbox, and workflow layers; introduce new exceptions and an `error_code` field, map these conditions to a 500 server error, and adjust execution/status handling and metadata pruning.
Changes
Sequence Diagram(s)
sequenceDiagram
participant Client
participant API as API Deployment
participant Sandbox as Tool-Sandbox
participant Runner as Runner/Docker
participant Registry as Container Registry
Client->>API: POST execution request
activate API
API->>Sandbox: request tool run
activate Sandbox
Sandbox->>Runner: HTTP request to runner
activate Runner
Runner->>Registry: pull tool image
activate Registry
Registry-->>Runner: ImageNotFound / 404 / pull stream error
deactivate Registry
Runner->>Runner: raise ToolImageNotFoundError -> return structured error (error_code)
Runner-->>Sandbox: HTTP error with JSON including error_code
deactivate Runner
Sandbox->>API: raise ToolNotFoundInRegistryError (propagate)
deactivate Sandbox
API->>API: contains_tool_not_found_error() detects pattern
API-->>Client: 500 Internal Server Error with status/result
deactivate API
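The runner-to-sandbox hop in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `error_code` value, the DTO field names other than `error_code`, and the helper `handle_runner_response` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Optional

# Hypothetical value; the real error code string is not shown in this PR page.
TOOL_IMAGE_NOT_FOUND = "TOOL_IMAGE_NOT_FOUND"


class ToolNotFoundInRegistryError(Exception):
    """Stand-in for the tool-sandbox exception named in the PR."""


@dataclass
class RunnerContainerRunResponse:
    """Simplified stand-in for the runner response DTO (fields assumed)."""

    status: str
    error: Optional[str] = None
    error_code: Optional[str] = None

    @classmethod
    def from_json(cls, payload: dict[str, Any]) -> "RunnerContainerRunResponse":
        return cls(
            status=payload.get("status", "ERROR"),
            error=payload.get("error"),
            error_code=payload.get("error_code"),
        )


def handle_runner_response(payload: dict[str, Any]) -> RunnerContainerRunResponse:
    """Raise a typed error when the runner reports a missing tool image."""
    response = RunnerContainerRunResponse.from_json(payload)
    if response.error_code == TOOL_IMAGE_NOT_FOUND:
        raise ToolNotFoundInRegistryError(
            response.error or "Tool image not found in container registry"
        )
    return response
```

Carrying a machine-readable `error_code` alongside the message lets each layer branch on the code rather than on the wording of the error text.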
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@runner/src/unstract/runner/clients/docker_client.py`:
- Around line 177-205: The pull-stream error handling in the client.api.pull
loop currently raises ToolImageNotFoundError for any stream "error"; update the
logic in the loop that inspects line.get("error") so it discriminates based on
errorDetail and message: extract error_msg = line.get("error") and err_detail =
line.get("errorDetail", {}), then if err_detail.get("code") == 404 or "manifest
unknown" in error_msg.lower() or "not found" in error_msg.lower() raise
ToolImageNotFoundError(repository, image_tag); otherwise log the full error
(include error_msg and err_detail) and re-raise or propagate a generic exception
(so auth 401, rate-limit 429, network errors are not misclassified as
not-found); keep using image_name_with_tag, repository, image_tag and
ToolImageNotFoundError to locate the code to change.
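The discrimination described in this comment could look roughly like the sketch below. It assumes each pull-stream line has already been decoded into a dict (as docker-py yields with `stream=True, decode=True`); `ImagePullError`, `is_not_found`, and `check_pull_line` are hypothetical names introduced for illustration, and `ToolImageNotFoundError` is a simplified stand-in for the runner's class.

```python
from typing import Any


class ToolImageNotFoundError(Exception):
    """Simplified stand-in for the runner's exception."""

    def __init__(self, repository: str, image_tag: str) -> None:
        super().__init__(
            f"Image {repository}:{image_tag} not found in container registry"
        )
        self.repository = repository
        self.image_tag = image_tag


class ImagePullError(Exception):
    """Hypothetical generic error for pull failures that are not 'not found'."""


def is_not_found(line: dict[str, Any]) -> bool:
    """Return True only when a stream error looks like a missing image."""
    error_msg = str(line.get("error") or "").lower()
    err_detail = line.get("errorDetail")
    if not isinstance(err_detail, dict):
        err_detail = {}
    return (
        err_detail.get("code") == 404
        or "manifest unknown" in error_msg
        or "not found" in error_msg
    )


def check_pull_line(line: dict[str, Any], repository: str, image_tag: str) -> None:
    """Inspect one decoded pull-stream line and raise the matching error."""
    if not line.get("error"):
        return  # ordinary progress line
    if is_not_found(line):
        raise ToolImageNotFoundError(repository, image_tag)
    # Auth (401), rate-limit (429), network errors, etc. stay generic.
    raise ImagePullError(
        f"Pull failed for {repository}:{image_tag}: "
        f"{line.get('error')} (detail: {line.get('errorDetail')})"
    )
```

Keeping the classification in a small pure function makes it easy to unit-test against sample stream lines without a Docker daemon.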
🧹 Nitpick comments (1)
backend/api_v2/api_deployment_views.py (1)
54-88: Prefer explicit `error_code` checks over string matching.
Since downstream results now carry `error_code`, checking it directly avoids brittle text matching and future message changes.
♻️ Suggested refinement
-    if isinstance(response, dict):
-        error = response.get("error")
-        result = response.get("result", [])
-    else:
-        error = getattr(response, "error", None)
-        result = getattr(response, "result", []) or []
+    if isinstance(response, dict):
+        error = response.get("error")
+        error_code = response.get("error_code")
+        result = response.get("result", [])
+    else:
+        error = getattr(response, "error", None)
+        error_code = getattr(response, "error_code", None)
+        result = getattr(response, "result", []) or []
+
+    if error_code == ToolNotFoundInRegistry.ERROR_CODE:
+        return True
 ...
-            if isinstance(item, dict):
-                file_error = item.get("error", "")
+            if isinstance(item, dict):
+                if item.get("error_code") == ToolNotFoundInRegistry.ERROR_CODE:
+                    return True
+                file_error = item.get("error", "")
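A self-contained version of this idea might look like the following sketch. It checks `error_code` first and falls back to the message pattern; the constant value `TOOL_IMAGE_NOT_FOUND` and the helper `_get` are assumptions for illustration, and the real `contains_tool_not_found_error()` in `api_deployment_views.py` may be structured differently.

```python
from typing import Any

# Hypothetical code value; the PR page does not show the real constant.
TOOL_NOT_FOUND_ERROR_CODE = "TOOL_IMAGE_NOT_FOUND"
NOT_FOUND_PATTERN = "not found in container registry"


def _get(obj: Any, key: str, default: Any = None) -> Any:
    """Read a field from either a dict or an attribute-style object."""
    if isinstance(obj, dict):
        return obj.get(key, default)
    return getattr(obj, key, default)


def contains_tool_not_found_error(response: Any) -> bool:
    """Detect the condition at the top level or in any per-file result."""
    if _get(response, "error_code") == TOOL_NOT_FOUND_ERROR_CODE:
        return True
    if NOT_FOUND_PATTERN in str(_get(response, "error") or "").lower():
        return True  # fallback for results that carry no error_code
    for item in _get(response, "result") or []:
        if not isinstance(item, dict):
            continue
        if item.get("error_code") == TOOL_NOT_FOUND_ERROR_CODE:
            return True
        if NOT_FOUND_PATTERN in str(item.get("error") or "").lower():
            return True
    return False
```

Retaining the string fallback is a design choice for the sketch: it keeps older results without `error_code` detectable while the code check handles new ones.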
LGTM for the most part. Please confirm whether the 409 status code can be 500 here instead:
- Between runner -> backend, it can be 409
- Between backend -> user, it needs to be 500
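One way to express the split suggested here is a small translation step at the backend boundary. This is only a sketch: `client_facing_status` and the error code value are hypothetical and not taken from the PR.

```python
from http import HTTPStatus
from typing import Optional

# Hypothetical code value used between runner and backend.
TOOL_NOT_FOUND_ERROR_CODE = "TOOL_IMAGE_NOT_FOUND"


def client_facing_status(internal_status: int, error_code: Optional[str]) -> int:
    """Map the internal runner -> backend status to the one the user sees."""
    if (
        internal_status == HTTPStatus.CONFLICT
        and error_code == TOOL_NOT_FOUND_ERROR_CODE
    ):
        # Internally a 409; surfaced to the API caller as a server error.
        return int(HTTPStatus.INTERNAL_SERVER_ERROR)
    return internal_status
```

With this shape, internal services can keep a specific status for the condition while the public API reports it as a server-side failure.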
Test Results
Summary
Runner Tests - Full Report
SDK1 Tests - Full Report
What
Why
`execution_status: "ERROR"` in the response body, or HTTP 422 with a generic error message
How
- `ToolImageNotFoundError` exception in runner to catch Docker image pull failures
- `runner.py` to properly catch the exception during image pull
- `error_code` field to `RunnerContainerRunResponse` DTO to propagate specific error types
- `ToolNotFoundInRegistryError` exception in tool-sandbox that's raised when `error_code` matches
- `ToolNotFoundInRegistry` API exception (HTTP 409) in backend
- `api_deployment_views.py` to detect "not found in container registry" pattern in both top-level and file-level errors
- `_process_final_output` to preserve original error message when secondary exceptions occur during finalization
Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
Database Migrations
Env Config
Relevant Docs
Related Issues or PRs
Dependencies Versions
Notes on Testing
.env:
Screenshots
Checklist
I have read and understood the Contribution Guidelines.