add reasoning content to ChatCompletions #494
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -48,6 +48,9 @@ async def handle_stream( | |
usage: CompletionUsage | None = None | ||
state = StreamingState() | ||
|
||
is_reasoning_model = False | ||
emit_reasoning_content = False | ||
emit_content = False | ||
async for chunk in stream: | ||
if not state.started: | ||
state.started = True | ||
|
@@ -62,9 +65,16 @@ async def handle_stream( | |
continue | ||
|
||
delta = chunk.choices[0].delta | ||
reasoning_content = None | ||
content = None | ||
if hasattr(delta, "reasoning_content"): | ||
reasoning_content = delta.reasoning_content | ||
is_reasoning_model = True | ||
if hasattr(delta, "content"): | ||
content = delta.content | ||
|
||
# Handle text | ||
if delta.content: | ||
if reasoning_content or content: | ||
if not state.text_content_index_and_output: | ||
# Initialize a content tracker for streaming text | ||
state.text_content_index_and_output = ( | ||
|
@@ -100,16 +110,59 @@ async def handle_stream( | |
), | ||
type="response.content_part.added", | ||
) | ||
# Emit the delta for this segment of content | ||
yield ResponseTextDeltaEvent( | ||
content_index=state.text_content_index_and_output[0], | ||
delta=delta.content, | ||
item_id=FAKE_RESPONSES_ID, | ||
output_index=0, | ||
type="response.output_text.delta", | ||
) | ||
# Accumulate the text into the response part | ||
state.text_content_index_and_output[1].text += delta.content | ||
|
||
if reasoning_content is not None: | ||
if not emit_reasoning_content: | ||
emit_reasoning_content = True | ||
|
||
reasoning_content_title = "# reasoning content\n\n" | ||
this doesn't seem right - why hardcode?
It's a Markdown title that separates the reasoning content from the content. It's a constant value, so it has to be hardcoded. The whole output looks like this:
Another way is to use
Which way do you prefer?
I think neither? IMO it would be better to emit a separate item for reasoning. For example, I was trying something like this in #581. What do you think?
Yeah, I agree with you. So should we emit ResponseReasoningSummaryTextDeltaEvent for reasoning content, or create a class like ResponseReasoningTextDeltaEvent in this repo?
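The "separate item" idea from this thread could look roughly like the sketch below. It assumes delta objects that may carry `reasoning_content` and `content` attributes; `ReasoningDelta`, `TextDelta`, and `split_deltas` are hypothetical names made up for illustration. They are not SDK classes, and this is not the approach taken in #581.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Union


@dataclass
class ReasoningDelta:
    """Hypothetical event carrying a piece of reasoning text."""
    delta: str


@dataclass
class TextDelta:
    """Hypothetical event carrying a piece of answer text."""
    delta: str


def split_deltas(
    deltas: Iterable[object],
) -> Iterator[Union[ReasoningDelta, TextDelta]]:
    """Route each delta to its own event type instead of one text stream."""
    for delta in deltas:
        # getattr with a default covers deltas that lack either attribute.
        reasoning = getattr(delta, "reasoning_content", None)
        content = getattr(delta, "content", None)
        if reasoning:
            yield ReasoningDelta(delta=reasoning)
        if content:
            yield TextDelta(delta=content)
```

With separate event types, no Markdown title has to be injected into the text stream, and a consumer can decide for itself how (or whether) to display the reasoning.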
||
# Emit the reasoning content title | ||
yield ResponseTextDeltaEvent( | ||
content_index=state.text_content_index_and_output[0], | ||
delta=reasoning_content_title, | ||
item_id=FAKE_RESPONSES_ID, | ||
output_index=0, | ||
type="response.output_text.delta", | ||
) | ||
# Accumulate the text into the response part | ||
state.text_content_index_and_output[1].text += reasoning_content_title | ||
|
||
# Emit the delta for this segment of content | ||
yield ResponseTextDeltaEvent( | ||
content_index=state.text_content_index_and_output[0], | ||
delta=reasoning_content, | ||
item_id=FAKE_RESPONSES_ID, | ||
output_index=0, | ||
type="response.output_text.delta", | ||
) | ||
# Accumulate the text into the response part | ||
state.text_content_index_and_output[1].text += reasoning_content | ||
|
||
if content is not None: | ||
if not emit_content and is_reasoning_model: | ||
emit_content = True | ||
content_title = "\n\n# content\n\n" | ||
same here?
||
# Emit the content title | ||
yield ResponseTextDeltaEvent( | ||
content_index=state.text_content_index_and_output[0], | ||
delta=content_title, | ||
item_id=FAKE_RESPONSES_ID, | ||
output_index=0, | ||
type="response.output_text.delta", | ||
) | ||
# Accumulate the text into the response part | ||
state.text_content_index_and_output[1].text += content_title | ||
|
||
# Emit the delta for this segment of content | ||
yield ResponseTextDeltaEvent( | ||
content_index=state.text_content_index_and_output[0], | ||
delta=content, | ||
item_id=FAKE_RESPONSES_ID, | ||
output_index=0, | ||
type="response.output_text.delta", | ||
) | ||
# Accumulate the text into the response part | ||
state.text_content_index_and_output[1].text += content | ||
|
||
# Handle refusals (model declines to answer) | ||
if delta.refusal: | ||
|
when would this be true?
When using a reasoning model such as deepseek-reasoner (DeepSeek's reasoning model).
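To make the answer concrete, here is a small sketch of when a `hasattr(delta, "reasoning_content")` check is true. It uses `SimpleNamespace` stand-ins rather than real SDK objects, and it assumes that a reasoning provider may attach the field with a `None` value on some chunks; `detect_reasoning` is a hypothetical helper.

```python
from types import SimpleNamespace
from typing import Tuple


def detect_reasoning(delta: object) -> Tuple[bool, bool]:
    """Return (has_field, has_text) for a delta-like object."""
    # True whenever the attribute exists, even if its value is None.
    has_field = hasattr(delta, "reasoning_content")
    # True only when the attribute holds non-empty text.
    has_text = bool(getattr(delta, "reasoning_content", None))
    return has_field, has_text


# Stand-ins for three kinds of chunk deltas (assumed shapes).
plain = SimpleNamespace(content="hi")
reasoning = SimpleNamespace(content=None, reasoning_content="step 1")
trailing = SimpleNamespace(content="answer", reasoning_content=None)
```

Under these assumptions the field check marks the stream as coming from a reasoning model as soon as the attribute appears, while the text check says whether the current chunk has any reasoning to emit.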