vLLM

vLLM is a high-performance library for LLM inference and serving with an OpenAI-compatible API.

Launching

To start the vLLM container, run:

export HF_TOKEN=<your_huggingface_token_here>
podman compose up --detach

Once the container is running, you can access the OpenAI-compatible API at http://localhost:8000.
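Model loading can continue for some time after the container has started, so the API may not answer immediately. The following is a minimal sketch of how one might wait for the server and then list the served models. The helper names (`vllm_url`, `wait_for_vllm`, `list_models`) and the `VLLM_BASE_URL` variable are illustrative and not part of this repository. The sketch assumes the server exposes vLLM's usual `/health` and `/v1/models` routes on the port given above, and that `curl` is installed.

```shell
# Sketch only: helper names and VLLM_BASE_URL are hypothetical.
# The default matches the address given in this README.
VLLM_BASE_URL="${VLLM_BASE_URL:-http://localhost:8000}"

# Build the full URL for an API path such as /v1/models,
# dropping one trailing slash from the base URL if present.
vllm_url() {
  printf '%s%s\n' "${VLLM_BASE_URL%/}" "$1"
}

# Poll /health (assumed route) every 5 seconds until it responds
# successfully or the given number of attempts (default 60) runs out.
wait_for_vllm() {
  attempts="${1:-60}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl --silent --fail --output /dev/null "$(vllm_url /health)"; then
      return 0
    fi
    i=$((i + 1))
    sleep 5
  done
  echo "vLLM did not become ready at ${VLLM_BASE_URL}" >&2
  return 1
}

# Print the JSON list of models the server reports (assumed route).
list_models() {
  curl --silent --fail "$(vllm_url /v1/models)"
}
```

After `podman compose up --detach`, running `wait_for_vllm && list_models` should print a JSON document whose model IDs can be used in later requests. Nothing is sent until one of the functions is called.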

You can view the OpenAPI documentation at http://localhost:8000/docs.

To stop and remove the containers, along with the volumes declared in compose.yaml, run:

podman compose down --volumes