Open
Description
Checklist
- If this is not a feature request but a general question, please start a discussion at https://github.com/sgl-project/sglang/discussions. Otherwise, it will be closed.
- Please use English. Otherwise, it will be closed.
Motivation
We're testing the /v1/score API for multi-candidate scoring (great work, btw!), and part of our stack is automatically collecting token counts for all of our inputs.
Currently (as of 28e2340), the scoring API returns a null usage field:
$ curl -X POST "http://localhost:8000/v1/score" \
    -H "Content-Type: application/json" \
    -d '{
      "query": "Is the following city the capital of California? Answer No or Yes only. City: ",
      "items": ["Austin", "San Jose", "San Francisco"],
      "label_token_ids": [9454, 2753],
      "model": "qwen3-06b"
    }' | jq
Returns:
{
  "scores": [..],
  "model": "qwen3-06b",
  "usage": null,
  "object": "scoring"
}
Ideally we could get the standard usage block returned, e.g.:
"usage": {
  "prompt_tokens": 15,
  "total_tokens": 15,
  "completion_tokens": 0,
  "prompt_tokens_details": null,
  "reasoning_tokens": 0
},
Is this a feature you'd be interested in? I'm happy to put up a PR if so.
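To make the request concrete, here is a minimal sketch of how such a usage block might be assembled for a scoring request. It is not SGLang's implementation: build_scoring_usage is a hypothetical helper, and it assumes that each query+item pair is scored as its own prompt, so the prompt token counts are summed across items. The per-item counts in the usage example are made up for illustration.

```python
from typing import Dict, List, Optional


def build_scoring_usage(prompt_token_counts: List[int]) -> Dict[str, Optional[int]]:
    """Hypothetical helper: aggregate per-item prompt token counts into a usage block.

    Assumes one prompt per (query, item) pair. Scoring generates no new
    tokens, so completion_tokens is reported as 0.
    """
    prompt_tokens = sum(prompt_token_counts)
    return {
        "prompt_tokens": prompt_tokens,
        "total_tokens": prompt_tokens,  # no completion tokens to add
        "completion_tokens": 0,
        "prompt_tokens_details": None,
        "reasoning_tokens": 0,
    }


# Illustrative counts only: three items, five prompt tokens each.
usage = build_scoring_usage([5, 5, 5])
print(usage)
```

Whether the shared query prefix should be counted once or once per item is a design choice this sketch does not settle; it counts it per item.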
Related resources
No response