fix(backend): Handle missing content in streaming delta (#316)
## Summary
This PR fixes a `KeyError: 'content'` that occurs when processing
streaming chat completions.
## Details
When using the `chat_completions` endpoint with `stream=True`, the final
`delta` chunk sent by the server may not contain a `content` key; this is
standard API behavior for signaling the end of the stream.
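For example, a terminal chunk from an OpenAI-compatible server typically carries an empty `delta` and a `finish_reason` (shown here as a Python literal; the field values are illustrative):

```python
# Illustrative final streaming chunk: the "delta" object has no "content" key.
final_chunk = {
    "id": "chatcmpl-123",
    "object": "chat.completion.chunk",
    "choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}],
}
```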
The existing code in `_extract_completions_delta_content` did not
account for this possibility and tried to access `delta['content']`
directly, leading to a `KeyError` and causing the benchmark process to
crash when the stream ended.
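A minimal sketch of the kind of guard this change applies, assuming the helper receives the parsed chunk as a dict (the exact signature in guidellm may differ):

```python
from typing import Any, Optional


def _extract_completions_delta_content(data: dict[str, Any]) -> Optional[str]:
    """Return the text content of a streaming delta, or None if absent."""
    choices = data.get("choices") or []
    if not choices:
        return None
    delta = choices[0].get("delta") or {}
    # The terminal chunk omits 'content', so use .get() instead of
    # delta['content'] to avoid a KeyError when the stream ends.
    return delta.get("content")
```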
## Test Plan
This was discovered while running `guidellm benchmark` against an
OpenAI-compatible API endpoint (via `litellm`) that correctly implements
the streaming protocol.
```bash
guidellm benchmark \
  --target "http://10.64.1.62:4000/v1" \
  --model "qwen3-06b-2" \
  --processor "Qwen/Qwen3-0.6B" \
  --rate-type "synchronous" \
  --max-requests 1 \
  --data "prompt_tokens=32,output_tokens=32,samples=1"
```
## Related Issues
- Resolves #315
---
- [x] "I certify that all code in this PR is my own, except as noted
below."
## Use of AI
- [ ] Includes AI-assisted code completion
- [ ] Includes code generated by an AI application
- [ ] Includes AI-generated tests (NOTE: AI written tests should have a
docstring that includes `## WRITTEN BY AI ##`)
Signed-off-by: xinjun.jiang <[email protected]>
Co-authored-by: Samuel Monson <[email protected]>