Emulates a vLLM-served LLM, providing mock `/v1/completions` and `/v1/chat/completions` endpoints.
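
For reference, the sketch below shows roughly how such an emulator can be built with FastAPI. This is an illustration only, not the contents of `vllm_emulator.py`: the request models and canned response text are assumptions, and real OpenAI-style responses carry additional fields (e.g. `usage` and `logprobs`).

```python
# Hypothetical minimal emulator sketch -- see vllm_emulator.py for the real thing.
import time
import uuid

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

CANNED_TEXT = "This is a mock completion."  # assumed placeholder response


class CompletionRequest(BaseModel):
    model: str  # any string is accepted, per the note below
    prompt: str | None = None
    max_tokens: int | None = None


class ChatMessage(BaseModel):
    role: str
    content: str


class ChatCompletionRequest(BaseModel):
    model: str
    messages: list[ChatMessage]
    max_tokens: int | None = None


@app.post("/v1/completions")
def completions(req: CompletionRequest):
    # Return a canned OpenAI-style text completion.
    return {
        "id": f"cmpl-{uuid.uuid4()}",
        "object": "text_completion",
        "created": int(time.time()),
        "model": req.model,
        "choices": [{"index": 0, "text": CANNED_TEXT, "finish_reason": "stop"}],
    }


@app.post("/v1/chat/completions")
def chat_completions(req: ChatCompletionRequest):
    # Return a canned OpenAI-style chat completion.
    return {
        "id": f"chatcmpl-{uuid.uuid4()}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": req.model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": CANNED_TEXT},
                "finish_reason": "stop",
            }
        ],
    }
```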
To install dependencies and run the emulator locally:

```bash
pip3 install -r requirements.txt
# OR uv venv; uv sync

fastapi dev vllm_emulator.py
```
You can then `curl` the "model" as if it were a real LLM, e.g.:
```bash
curl --request POST \
  --url http://localhost:8000/v1/chat/completions \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "vllm-runtime-cpu-fp16",
    "messages": [
      {
        "role": "user",
        "content": "What is the opposite of down?"
      }
    ],
    "temperature": 0,
    "logprobs": true,
    "max_tokens": 500
  }'
```
The endpoint accepts any string for the `model` argument.
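
The same request can be made from Python; here is a hypothetical snippet using the `requests` library (the model name `"anything-goes"` is arbitrary, per the note above, and the response parsing assumes the OpenAI-style schema the emulator mimics):

```python
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "anything-goes",  # any string works
        "messages": [{"role": "user", "content": "What is the opposite of down?"}],
        "temperature": 0,
        "max_tokens": 500,
    },
    timeout=10,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```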
To deploy the emulator to an OpenShift cluster:

```bash
oc apply -f deployment.yaml
```
This creates a service and route that can be used inside lm-eval, e.g.:
```yaml
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: LMEvalJob
metadata:
  name: evaljob
spec:
  model: local-completions
  taskList:
    taskNames:
      - arc_easy
  logSamples: true
  batchSize: "1"
  allowOnline: true
  allowCodeExecution: false
  outputs:
    pvcManaged:
      size: 5Gi
  modelArgs:
    - name: model
      value: emulatedModel
    - name: base_url
      value: http://vllm-emulator-service:8000/v1/completions
    - name: num_concurrent
      value: "1"
    - name: max_retries
      value: "3"
    - name: tokenized_requests
      value: "False"
    - name: tokenizer
      value: ibm-granite/granite-guardian-3.1-8b # this isn't used, but we need some valid value here
```
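
Note that `base_url` points at the in-cluster service created by `deployment.yaml`, and `model: local-completions` drives lm-eval's plain completions client rather than the chat endpoint. As the inline comment says, the `tokenizer` value is not meaningful to the emulator; lm-eval simply needs a tokenizer name it can load.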