PyTriton Torch linear model deployment

This example shows how to optimize a simple linear model and deploy it to PyTriton.
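The model in question is a plain torch.nn.Linear module. A minimal sketch of such a model (the input and output sizes here are illustrative, not necessarily the ones used by optimize.py):

```python
import torch

# A simple linear model: y = x @ W^T + b.
# The feature sizes are illustrative; optimize.py defines its own.
model = torch.nn.Linear(in_features=5, out_features=7)

x = torch.randn(2, 5)  # a batch of 2 samples with 5 features each
y = model(x)           # output batch of shape (2, 7)
```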

Requirements

The example requires the torch package. It can be installed in your current environment using pip:

pip install torch

Alternatively, you can use the NVIDIA PyTorch container:

docker run -it --gpus 1 --shm-size 8gb -v ${PWD}:${PWD} -w ${PWD} nvcr.io/nvidia/pytorch:23.01-py3 bash

If you choose to use the container, we recommend installing the NVIDIA Container Toolkit.

Install the Model Navigator

Install the Triton Model Navigator following the installation guide for Torch:

pip install --extra-index-url https://pypi.ngc.nvidia.com .[torch]

Note: run this command from the root directory of the repository.

Run model optimization

In the next step, run the optimization process for the model:

python examples/triton/optimize.py

Once the process is done, the linear.nav package is created in the current working directory.

Start PyTriton server

This step starts the PyTriton server with the package generated in the previous step.

./serve.py

Example PyTriton client

Use the client to test the model deployment on PyTriton:

./client.py