This repository was archived by the owner on May 10, 2024. It is now read-only.
This is a hotfix that lets users specify their own object store paths for checkpoints. Previously the path was hardcoded to the llm-atc bucket, which cannot work for everyone because S3 bucket names must be globally unique.
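As a rough illustration of what a user-supplied checkpoint path looks like, here is a minimal sketch that splits an object store URI into its scheme, bucket, and prefix. The helper name `split_checkpoint_uri`, the accepted schemes, and the example bucket are all assumptions made for illustration; none of them are taken from the llm-atc codebase.

```python
from urllib.parse import urlparse

# Hypothetical set of supported object store schemes (an assumption).
SUPPORTED_SCHEMES = {"s3", "gs"}


def split_checkpoint_uri(uri: str) -> tuple[str, str, str]:
    """Split an object store URI into (scheme, bucket, prefix).

    Hypothetical helper: not part of llm-atc.
    """
    parsed = urlparse(uri)
    if parsed.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported checkpoint scheme in {uri!r}")
    if not parsed.netloc:
        raise ValueError(f"missing bucket name in {uri!r}")
    # The bucket is the network location; the rest is the key prefix.
    return parsed.scheme, parsed.netloc, parsed.path.lstrip("/")


# Example with a made-up bucket name owned by the user.
scheme, bucket, prefix = split_checkpoint_uri("s3://my-team-bucket/checkpoints/run1")
```

Because every user supplies their own bucket, no single globally unique bucket name has to be shared across installations.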
This patch also includes some bugfixes and adds support for passing Hugging Face tokens, so gated or private models can be used for serving and training. In addition, serving now uses tensor parallelism across all available GPUs for a given model, which makes it possible to serve larger models such as llama-70b on a multi-GPU instance.
I promise to write a detailed changelog in the upcoming v0.1.4!