
# LLM Tutorial

LLM tutorial materials covering, but not limited to, NVIDIA NeMo, TensorRT-LLM, Triton Inference Server, and NeMo Guardrails.

This material is used in the NCHC LLM Bootcamp.

## Running on TWCC

Please follow the TWCC README to run the tutorials on TWCC.

## Running Locally

Install Docker and the NVIDIA Container Toolkit. Then add your user to the `docker` group and log out and back in (or restart) for the change to take effect.
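
A rough sketch of these steps on Ubuntu (assumes NVIDIA's apt repository for the Container Toolkit is already configured; see the official Docker and NVIDIA Container Toolkit docs for the authoritative steps):

```sh
# Sketch for Ubuntu; assumes NVIDIA's apt repo for the Container Toolkit
# is already set up (see NVIDIA's install guide).
sudo apt-get update
sudo apt-get install -y docker.io nvidia-container-toolkit
# Register the NVIDIA runtime with Docker and restart the daemon.
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
# Allow running docker without sudo; re-login afterwards.
sudo usermod -aG docker $USER
```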

```sh
git clone https://github.com/j3soon/LLM-Tutorial.git
cd LLM-Tutorial
```

```sh
# (a) NeMo
docker run --rm -it --gpus=all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
  -v $(pwd)/workspace:/workspace --network=host nvcr.io/nvidia/nemo:24.05
# in the container
jupyter lab
# open the notebook URL in your browser
```
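
If the notebooks cannot see a GPU, a quick sanity check inside the container (assuming the NeMo image's bundled PyTorch) is:

```sh
# Run inside the container: list visible GPUs and confirm CUDA is usable.
nvidia-smi
python -c "import torch; print(torch.cuda.is_available())"
```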

```sh
# (b) TensorRT-LLM
docker run --rm -it --gpus=all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
  -v $(pwd)/workspace:/workspace --network=host nvcr.io/nvidia/tritonserver:24.05-trtllm-python-py3
# in the container
jupyter lab
# open the notebook URL in your browser
```
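
Similarly, to verify that the TensorRT-LLM Python package is importable in the Triton container (assuming the 24.05-trtllm image bundles it):

```sh
# Run inside the container: print the bundled TensorRT-LLM version.
python3 -c "import tensorrt_llm; print(tensorrt_llm.__version__)"
```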

## Contributing

Make sure to run the following before committing:

```sh
pip install nb-clean
nb-clean clean workspace/NeMo_Training_TinyLlama.ipynb
nb-clean clean workspace/TensorRT-LLM.ipynb
nb-clean clean workspace/NeMo_Guardrails.ipynb
```
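
Alternatively, nb-clean can be registered as a git filter so notebooks are cleaned automatically on every commit (this uses nb-clean's `add-filter` subcommand):

```sh
# Registers a git filter in this repository that strips outputs and
# metadata from .ipynb files at commit time.
nb-clean add-filter
```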

## Contributors

The code was primarily written by Cliff, with assistance from others listed in the contributor list.

## Acknowledgements

We would like to thank NVIDIA, OpenACC, and NCHC (National Center for High-performance Computing) for making this bootcamp happen.

## References