Important
Development is still in progress for several project components. See the notes below for which workflows are best supported.
The shortfin sub-project is SHARK's high performance inference library and serving engine.
- API documentation for shortfin is available on readthedocs.
The SHARK Tank sub-project contains a collection of model recipes and conversion tools to produce inference-optimized programs.
Warning
SHARK Tank is still under development. Experienced users may want to try it out, but we currently recommend most users download pre-exported or pre-compiled model files for serving with shortfin.
- See the SHARK Tank Programming Guide for information about core concepts, the development model, dataset management, and more.
- See Direct Quantization with SHARK Tank for information about quantization support.
The Tuner sub-project assists with tuning program performance by searching for optimal parameter configurations to use during model compilation.
Model name | Model recipes | Serving apps |
---|---|---|
SDXL | sharktank/sharktank/models/punet/ | shortfin/python/shortfin_apps/sd/ |
llama | sharktank/sharktank/models/llama/ | shortfin/python/shortfin_apps/llm/ |
Each sub-project has its own developer guide. If you would like to work across projects, these instructions should help you get started:
We recommend setting up a Python virtual environment (venv). The project is configured to ignore .venv directories, and editors like VSCode pick them up by default.
python -m venv .venv
source .venv/bin/activate
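To confirm that the virtual environment is actually active before installing anything into it, a quick check along these lines can help. This is a minimal sketch using only the Python standard library; the function name in_virtualenv is illustrative and not part of SHARK.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

If the check reports that no environment is active, re-run the activate command in the current shell before continuing.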
If no explicit action is taken, the default PyTorch version will be installed. This will give you a current CUDA-based version, which takes longer to download and includes other dependencies that SHARK does not require. To install a different variant, run one of these commands first:
- CPU:
  pip install -r pytorch-cpu-requirements.txt
- ROCm:
  pip install -r pytorch-rocm-requirements.txt
- Other: see instructions at https://pytorch.org/get-started/locally/.
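After installing, it can be useful to check which PyTorch variant ended up in the environment. The sketch below is one way to do that, assuming the installed build exposes torch.version.cuda and torch.version.hip (the code falls back to None if either attribute is missing). The helper names classify_build and installed_torch_variant are hypothetical and not part of SHARK or PyTorch.

```python
from typing import Optional


def classify_build(cuda: Optional[str], hip: Optional[str]) -> str:
    """Map torch.version.cuda / torch.version.hip values to a variant name."""
    if hip:
        return "rocm"
    if cuda:
        return "cuda"
    return "cpu"


def installed_torch_variant() -> str:
    """Return 'cpu', 'cuda', 'rocm', or 'not installed' for this environment."""
    try:
        import torch
    except ImportError:
        return "not installed"
    # torch.version.cuda is set for CUDA builds and torch.version.hip for
    # ROCm builds; getattr guards against builds lacking either attribute.
    return classify_build(
        getattr(torch.version, "cuda", None),
        getattr(torch.version, "hip", None),
    )


if __name__ == "__main__":
    print("PyTorch variant:", installed_torch_variant())
```

A result of "cuda" after a plain install matches the default behavior described above. To switch variants, install from the matching requirements file.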
# Install editable local projects.
pip install -r requirements.txt -e sharktank/ shortfin/
# Optionally clone and install the latest editable iree-turbine dep in deps/,
# along with nightly versions of iree-base-compiler and iree-base-runtime.
pip install -f https://iree.dev/pip-release-links.html --upgrade --pre \
iree-base-compiler iree-base-runtime --src deps \
-e "git+https://github.com/iree-org/iree-turbine.git#egg=iree-turbine"
See also: docs/nightly_releases.md.
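Because the optional step pulls nightly builds, it may help to record which versions were actually installed. The following is a minimal sketch using importlib.metadata from the standard library. The distribution names are taken from the pip command above, while the helper name installed_versions is illustrative only.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional

# Distribution names as they appear in the pip command above.
IREE_PACKAGES = ("iree-base-compiler", "iree-base-runtime", "iree-turbine")


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(IREE_PACKAGES).items():
        print(f"{name}: {ver or 'not installed'}")
```

Including this output in a bug report makes it easier to tell which nightly build a problem was seen with.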
pytest sharktank
pytest shortfin
This project is set up to use the pre-commit tooling. To install it in your local repo, run:
pre-commit install
After this point, when making commits locally, hooks will run. See https://pre-commit.com/.