
Commit

Use llama chat tags in example requests (#951)
Use llama chat tags in example requests.

More details can be found in the discussion on #934.
stbaione authored and eagarvey-amd committed Feb 13, 2025
1 parent 6621138 commit f7c0b3f
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions docs/shortfin/llm/user/llama_serving.md
````diff
@@ -262,7 +262,7 @@ Next, let's send a generation request:
 curl http://localhost:8000/generate \
   -H "Content-Type: application/json" \
   -d '{
-    "text": "Name the capital of the United States.",
+    "text": "<|begin_of_text|>Name the capital of the United States.<|eot_id|>",
     "sampling_params": {"max_completion_tokens": 50}
   }'
 ```
@@ -281,7 +281,7 @@ port = 8000 # Change if running on a different port
 generate_url = f"http://localhost:{port}/generate"
 def generation_request():
-    payload = {"text": "Name the capital of the United States.", "sampling_params": {"max_completion_tokens": 50}}
+    payload = {"text": "<|begin_of_text|>Name the capital of the United States.<|eot_id|>", "sampling_params": {"max_completion_tokens": 50}}
     try:
         resp = requests.post(generate_url, json=payload)
         resp.raise_for_status()  # Raises an HTTPError for bad responses
````
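To illustrate the change, here is a minimal sketch of building the tagged payload programmatically. The tag strings and the payload shape come from the diff above; the helper names (`tag_prompt`, `build_payload`) are hypothetical conveniences, not part of the shortfin API:

```python
# Sketch: wrap a raw prompt in the Llama chat special tokens used in the
# updated examples. <|begin_of_text|> opens the sequence and <|eot_id|>
# marks the end of the turn; both strings are taken from the diff above.
BEGIN_OF_TEXT = "<|begin_of_text|>"
EOT_ID = "<|eot_id|>"


def tag_prompt(text: str) -> str:
    """Return `text` wrapped in the chat tags shown in the examples."""
    return f"{BEGIN_OF_TEXT}{text}{EOT_ID}"


def build_payload(text: str, max_tokens: int = 50) -> dict:
    """Build the JSON body sent to the /generate endpoint."""
    return {
        "text": tag_prompt(text),
        "sampling_params": {"max_completion_tokens": max_tokens},
    }
```

A payload built this way can then be POSTed with `requests.post(generate_url, json=build_payload("Name the capital of the United States."))`, matching the updated `generation_request` example.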
