PyTriton Torch linear model deployment

This example shows how to optimize a simple linear model and deploy it to PyTriton.
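The model in question is a plain torch.nn.Linear module. A minimal sketch of such a model (the input and output sizes here are illustrative, not necessarily the ones used by optimize.py):

```python
import torch

# A simple linear model: y = x @ W^T + b.
# The feature sizes are illustrative; optimize.py defines its own.
model = torch.nn.Linear(in_features=5, out_features=7)

x = torch.randn(2, 5)  # a batch of 2 samples with 5 features each
y = model(x)           # output batch of shape (2, 7)
```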

Requirements

The example requires the torch package. It can be installed in your current environment using pip:

pip install torch

Alternatively, you can use the NVIDIA PyTorch container:

docker run -it --gpus 1 --shm-size 8gb -v ${PWD}:${PWD} -w ${PWD} nvcr.io/nvidia/pytorch:23.01-py3 bash

If you choose to use the container, we recommend installing the NVIDIA Container Toolkit.

Install the Model Navigator

Install the Triton Model Navigator following the installation guide for Torch:

pip install --extra-index-url https://pypi.ngc.nvidia.com .[torch]

Note: run this command from the root directory of the repository.

Run model optimization

In the next step, run the optimization process for the model:

python examples/triton/optimize.py

Once the process is done, the linear.nav package is created in the current working directory.

Start PyTriton server

This step starts the PyTriton server with the package generated in the previous step.

./serve.py

Example PyTriton client

Use the client to test the model deployment on PyTriton:

./client.py