
# LLM Tutorial

LLM tutorial materials covering, but not limited to, NVIDIA NeMo, TensorRT-LLM, Triton Inference Server, and NeMo Guardrails.

This material is used in the NCHC LLM Bootcamp.

## Running on TWCC

Please follow the TWCC README to run the tutorials on TWCC.

## Running Locally

Install Docker and the NVIDIA Container Toolkit. Then add your user to the `docker` group and log out and back in (or restart) for the change to take effect.
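
A rough sketch of these steps on Ubuntu (assumes NVIDIA's apt repository for the Container Toolkit is already configured; see the official Docker and NVIDIA Container Toolkit docs for the authoritative steps):

```sh
# Sketch for Ubuntu; assumes NVIDIA's apt repo for the Container Toolkit
# is already set up (see NVIDIA's install guide).
sudo apt-get update
sudo apt-get install -y docker.io nvidia-container-toolkit
# Register the NVIDIA runtime with Docker and restart the daemon.
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
# Allow running docker without sudo; re-login afterwards.
sudo usermod -aG docker $USER
```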

```sh
git clone https://github.com/j3soon/LLM-Tutorial.git
cd LLM-Tutorial
```

```sh
# (a) NeMo
docker run --rm -it --gpus=all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
  -v $(pwd)/workspace:/workspace --network=host nvcr.io/nvidia/nemo:24.05
# in the container
jupyter lab
# open the notebook URL in your browser
```
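
If the notebooks cannot see a GPU, a quick sanity check inside the container (assuming the NeMo image's bundled PyTorch) is:

```sh
# Run inside the container: list visible GPUs and confirm CUDA is usable.
nvidia-smi
python -c "import torch; print(torch.cuda.is_available())"
```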

```sh
# (b) TensorRT-LLM
docker run --rm -it --gpus=all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
  -v $(pwd)/workspace:/workspace --network=host nvcr.io/nvidia/tritonserver:24.05-trtllm-python-py3
# in the container
jupyter lab
# open the notebook URL in your browser
```
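
Similarly, to verify that the TensorRT-LLM Python package is importable in the Triton container (assuming the 24.05-trtllm image bundles it):

```sh
# Run inside the container: print the bundled TensorRT-LLM version.
python3 -c "import tensorrt_llm; print(tensorrt_llm.__version__)"
```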

## Contributing

Make sure to run the following before committing:

```sh
pip install nb-clean
nb-clean clean workspace/NeMo_Training_TinyLlama.ipynb
nb-clean clean workspace/TensorRT-LLM.ipynb
nb-clean clean workspace/NeMo_Guardrails.ipynb
```
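
Alternatively, nb-clean can be registered as a git filter so notebooks are cleaned automatically on every commit (this uses nb-clean's `add-filter` subcommand):

```sh
# Registers a git filter in this repository that strips outputs and
# metadata from .ipynb files at commit time.
nb-clean add-filter
```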

## Contributors

The code was primarily written by Cliff, with assistance from others listed in the contributor list.

## Acknowledgements

We would like to thank NVIDIA, OpenACC, and NCHC (National Center for High-performance Computing) for making this bootcamp happen.

## References