Is your feature request related to a problem? Please describe.
When using Triton Inference Server with --model-control-mode=explicit and the Python backend, a Python model version whose repository directory is missing model.py can remain stuck in LOADING.
In our test, the model repository contained config.pbtxt and a version directory, but the version directory did not contain model.py. After calling the repository load API:
POST /v2/repository/models/<model_name>/load
Triton logged:
Failed to preinitialize Python stub: Python model file not found in '<model_repo>/<model_name>/<version>/model.py'
However, POST /v2/repository/index continued to report the model version as LOADING instead of transitioning to UNAVAILABLE/error with a reason. Retrying load after adding model.py did not recover the model, and unload also did not clear the stuck state. The practical recovery was restarting the Triton server/pod.
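For reference, the repro above can be sketched with stdlib-only HTTP calls against the KServe repository extension endpoints. The host, port, and model name here are placeholders, not the actual values from our deployment:

```python
import json
import urllib.error
import urllib.request

TRITON = "http://localhost:8000"  # assumption: Triton's default HTTP endpoint
MODEL = "my_python_model"         # hypothetical model name for illustration

def repo_request(path: str) -> urllib.request.Request:
    # Repository-extension calls are POSTs with a (possibly empty) JSON body.
    return urllib.request.Request(
        TRITON + path,
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def load_model(name: str) -> int:
    # POST /v2/repository/models/<name>/load; raises urllib.error.HTTPError
    # (e.g. 400) when Triton rejects the load outright.
    with urllib.request.urlopen(repo_request(f"/v2/repository/models/{name}/load")) as resp:
        return resp.status

def model_states(name: str) -> list:
    # POST /v2/repository/index; each entry carries name/version/state/reason.
    with urllib.request.urlopen(repo_request("/v2/repository/index")) as resp:
        index = json.loads(resp.read())
    return [e for e in index if e.get("name") == name]
```

In our tests, calling load_model() for a model whose version directory lacked model.py left every entry returned by model_states() at state "LOADING" indefinitely, with no reason field populated.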
This is different from other Python backend failures we tested, such as empty model.py, missing imports, syntax errors, or exceptions in initialize(). Those returned HTTP 400 from /load and showed UNAVAILABLE with a useful reason in the repository index.
Describe the solution you'd like
For missing model.py and similar Python backend preinitialization failures, Triton should transition the model version from LOADING to UNAVAILABLE or another terminal error state, with the failure reason visible in repository/index.
It would also help to have a supported recovery API for a model stuck in LOADING, such as:
- allowing unload to clear a stuck LOADING model/version, or
- exposing an admin/repository API to force a failed terminal state for a stuck load.
Describe alternatives you've considered
Our current workaround is to add preflight validation before calling Triton /load, specifically checking that Python backend model artifacts contain model.py.
If Triton still enters this stuck LOADING state, the only reliable recovery we found is restarting the affected Triton server/pod. This is operationally expensive because it can affect other loaded models on the same server.
Additional context
Environment tested:
- Triton image: nvcr.io/nvidia/tritonserver:24.10-pyt-python-py3
- Triton server version: 2.51.0
- Python backend stub linked to Python 3.10
- Model control mode: explicit
- Backend: python
Observed behavior:
- Missing model.py: /load did not produce a clean terminal UNAVAILABLE state; repository index stayed LOADING.
- Empty model.py: /load returned HTTP 400; repository index showed UNAVAILABLE with AttributeError.
- Missing import: /load returned HTTP 400; repository index showed UNAVAILABLE with ModuleNotFoundError.
- Syntax error: /load returned HTTP 400; repository index showed UNAVAILABLE with SyntaxError.
- initialize() exception: /load returned HTTP 400; repository index showed UNAVAILABLE with the exception reason.