Python backend model missing model.py remains stuck in LOADING and cannot be unloaded #8766

@aniket-wbd

Description

Is your feature request related to a problem? Please describe.

When using Triton Inference Server with --model-control-mode=explicit and the Python backend, a Python model version whose repository directory is missing model.py can remain stuck in LOADING.

In our test, the model repository contained config.pbtxt and a version directory, but the version directory did not contain model.py. After calling the repository load API:

POST /v2/repository/models/<model_name>/load

Triton logged:

Failed to preinitialize Python stub: Python model file not found in '<model_repo>/<model_name>/<version>/model.py'

However, POST /v2/repository/index continued to report the model version as LOADING instead of transitioning to UNAVAILABLE/error with a reason. Retrying load after adding model.py did not recover the model, and unload also did not clear the stuck state. The practical recovery was restarting the Triton server/pod.
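To monitor for this condition, the repository index response can be inspected programmatically. Triton's model-repository extension returns a JSON array of entries with `name`, `version`, `state`, and `reason` fields from `POST /v2/repository/index`; the helper below, which is illustrative rather than part of any Triton client library, flags entries that report a non-terminal state such as the stuck LOADING we observed:

```python
import json

# States that should only ever be transient; an entry that stays in one of
# these indefinitely is the symptom described in this issue.
NON_TERMINAL_STATES = {"LOADING", "UNLOADING"}

def stuck_entries(index_json: str):
    """Return (name, version, state) tuples for non-terminal index entries."""
    return [
        (entry.get("name"), entry.get("version"), entry.get("state"))
        for entry in json.loads(index_json)
        if entry.get("state") in NON_TERMINAL_STATES
    ]

# Sample response shaped like the one we observed for the missing-model.py
# case (model name is a placeholder):
sample = '[{"name": "my_model", "version": "1", "state": "LOADING", "reason": ""}]'
print(stuck_entries(sample))  # → [('my_model', '1', 'LOADING')]
```

In practice the JSON string would come from posting to `<triton_host>:8000/v2/repository/index`; repeated non-empty results over a sustained window indicate the stuck state rather than an in-progress load.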

This is different from other Python backend failures we tested, such as empty model.py, missing imports, syntax errors, or exceptions in initialize(). Those returned HTTP 400 from /load and showed UNAVAILABLE with a useful reason in the repository index.

Describe the solution you'd like

For missing model.py and similar Python backend preinitialization failures, Triton should transition the model version from LOADING to UNAVAILABLE or another terminal error state, with the failure reason visible in repository/index.

It would also help to have a supported recovery API for a model stuck in LOADING, such as:

  • allowing unload to clear a stuck LOADING model/version, or
  • exposing an admin/repository API to force a failed terminal state for a stuck load.

Describe alternatives you've considered

Our current workaround is to add preflight validation before calling Triton /load, specifically checking that Python backend model artifacts contain model.py.
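A minimal sketch of that preflight check is below. It validates only the repository layout that the Python backend requires (a `config.pbtxt` at the model root and a `model.py` in each numeric version directory); the function name and return convention are our own, not a Triton API:

```python
import pathlib

def validate_python_model(model_dir: str) -> list[str]:
    """Preflight checks for a Triton Python-backend model repository entry.

    Returns a list of problems; an empty list means the layout looks loadable.
    """
    root = pathlib.Path(model_dir)
    problems = []
    if not (root / "config.pbtxt").is_file():
        problems.append(f"missing {root / 'config.pbtxt'}")
    # Triton treats numerically named subdirectories as model versions.
    version_dirs = [d for d in root.iterdir() if d.is_dir() and d.name.isdigit()]
    if not version_dirs:
        problems.append(f"no version directories under {root}")
    for vdir in version_dirs:
        if not (vdir / "model.py").is_file():
            problems.append(f"missing {vdir / 'model.py'}")
    return problems
```

We run this before calling `POST /v2/repository/models/<model_name>/load` and skip the load entirely when it reports problems, which avoids triggering the stuck state in the first place.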

If Triton still enters this stuck LOADING state, the only reliable recovery we found is restarting the affected Triton server/pod. This is operationally expensive because it can affect other loaded models on the same server.
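Because no force-fail or force-unload API exists today, the best we can do client-side is bound how long we wait. The sketch below polls a caller-supplied state lookup (e.g. one that queries `/v2/repository/index`) and raises after a deadline so automation can alert or restart instead of blocking forever; the function and its parameters are our own mitigation, not a Triton feature:

```python
import time

# States that should only ever be transient.
NON_TERMINAL = {"LOADING", "UNLOADING"}

def wait_for_terminal_state(poll_state, timeout_s=60.0, interval_s=1.0,
                            clock=time.monotonic, sleep=time.sleep):
    """Poll `poll_state()` until the model leaves LOADING/UNLOADING.

    Returns the terminal state, or raises TimeoutError if the model is
    still in a non-terminal state after `timeout_s` seconds. `clock` and
    `sleep` are injectable to keep the helper testable.
    """
    deadline = clock() + timeout_s
    while True:
        state = poll_state()
        if state not in NON_TERMINAL:
            return state
        if clock() >= deadline:
            raise TimeoutError(f"model still {state} after {timeout_s}s")
        sleep(interval_s)
```

With Triton's current behavior a missing-model.py load never leaves LOADING, so this only converts a silent hang into an actionable timeout; the underlying fix still has to come from the server transitioning the version to a terminal state.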

Additional context

Environment tested:

  • Triton image: nvcr.io/nvidia/tritonserver:24.10-pyt-python-py3
  • Triton server version: 2.51.0
  • Python backend stub linked to Python 3.10
  • Model control mode: explicit
  • Backend: python

Observed behavior:

  • Missing model.py: /load did not produce a clean terminal UNAVAILABLE state; repository index stayed LOADING.
  • Empty model.py: /load returned HTTP 400; repository index showed UNAVAILABLE with AttributeError.
  • Missing import: /load returned HTTP 400; repository index showed UNAVAILABLE with ModuleNotFoundError.
  • Syntax error: /load returned HTTP 400; repository index showed UNAVAILABLE with SyntaxError.
  • initialize() exception: /load returned HTTP 400; repository index showed UNAVAILABLE with the exception reason.
