Why is BaseChatOpenAI streaming the "get_final_completion" as a chunk? #29640
Example Code

```python
chain = self.model.bind(
    response_format=oai_response_format,
    tools=oai_tools,
    parallel_tool_calls=False,
) | JsonOutputParser()
```
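For context, a self-contained version of this setup might look like the sketch below. The model name, tool definition, and response format are assumptions reconstructed from the chunk dump in the Description, not the reporter's actual values.

```python
# Hypothetical, self-contained reconstruction of the setup above. The model
# name, tool definition, and response format are assumptions inferred from
# the chunk dump below, not the reporter's actual code.
from langchain_core.output_parsers import JsonOutputParser
from langchain_openai import ChatOpenAI

# OpenAI-format tool definition (the "oai_tools" in the snippet above).
oai_tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
    }
]

# OpenAI-format structured-output schema (the "oai_response_format" above).
# A "json_schema" response format routes BaseChatOpenAI through the code
# path that exposes get_final_completion.
oai_response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "weather_query",
        "schema": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
            "additionalProperties": False,
        },
        "strict": True,
    },
}

model = ChatOpenAI(model="gpt-4o-mini")
chain = model.bind(
    response_format=oai_response_format,
    tools=oai_tools,
    parallel_tool_calls=False,
) | JsonOutputParser()

# Streaming the chain surfaces the duplicated final chunk shown below.
for chunk in chain.stream("What is the weather in New York?"):
    print(chunk)
```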
Description

The chunk below is produced when tools and response_format are bound simultaneously. Note the doubled tool call id, name, and argument strings:

```python
AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_0xRjVSsqNJElRTnw357Z2CwIcall_0xRjVSsqNJElRTnw357Z2CwI', 'function': {'arguments': '{"location":"New York"}{"location":"New York"}', 'name': 'get_weatherget_weather', 'parsed_arguments': {'location': 'New York'}}, 'type': 'function'}], 'parsed': None, 'refusal': None}, response_metadata={'finish_reason': 'tool_calls', 'token_usage': None, 'model_name': '', 'system_fingerprint': 'fp_f3927aa00d', 'prompt_filter_results': [{'prompt_index': 0, 'content_filter_results': {}}]}, id='run-8ec9c1fe-4c25-42e6-a2c0-4b4e53ccee46', tool_calls=[{'name': 'get_weatherget_weather', 'args': {'location': 'New York'}, 'id': 'call_0xRjVSsqNJElRTnw357Z2CwIcall_0xRjVSsqNJElRTnw357Z2CwI', 'type': 'tool_call'}], tool_call_chunks=[{'name': 'get_weatherget_weather', 'args': '{"location":"New York"}{"location":"New York"}', 'id': 'call_0xRjVSsqNJElRTnw357Z2CwIcall_0xRjVSsqNJElRTnw357Z2CwI', 'index': 0, 'type': 'tool_call_chunk'}])
```

It is emitted from this code in BaseChatOpenAI's streaming path:

```python
if hasattr(response, "get_final_completion") and "response_format" in payload:
    final_completion = await response.get_final_completion()
    generation_chunk = self._get_generation_chunk_from_completion(
        final_completion
    )
    if run_manager:
        await run_manager.on_llm_new_token(
            generation_chunk.text, chunk=generation_chunk
        )
    yield generation_chunk
```

Is there a recommended way to identify that final chunk? It seems to me this shouldn't be a chunk but a complete message, since it is already fully formed. This change was introduced in #29044.
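As a downstream workaround, one option is to flag the synthetic final chunk by the parsed value it carries. This is only a sketch: it assumes, based on the dump above, that ordinary deltas carry parsed=None in additional_kwargs and only the chunk built from get_final_completion() carries a non-None value, which is an observation rather than a documented contract.

```python
# Sketch of a consumer-side filter for the synthetic final chunk.
# Assumption: only the chunk built from get_final_completion() carries a
# non-None "parsed" value in additional_kwargs (as the dump above suggests);
# this is observed behavior, not a documented contract.
from typing import AsyncIterator, Tuple

from langchain_core.messages import AIMessageChunk


async def tag_final_chunk(
    stream: AsyncIterator[AIMessageChunk],
) -> AsyncIterator[Tuple[bool, AIMessageChunk]]:
    """Yield (is_final, chunk) pairs, flagging the parsed-completion chunk."""
    async for chunk in stream:
        is_final = chunk.additional_kwargs.get("parsed") is not None
        yield is_final, chunk
```

A caller could then treat the flagged chunk as a complete message instead of appending it to the accumulated deltas.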
Replies: 1 comment
Thanks for raising this. I believe the bug was resolved in #29649. There are a few options for how we stream structured output with OpenAI:

1. Stream chunks with JSON string content, with a final chunk containing the parsed Pydantic object. Obtain this parsed object using get_final_completion. This is what is implemented now and what is demonstrated in OpenAI's docs. The downside, as you found, is that it erroneously doubled tool calls when we simultaneously stream tool calls + structured output (this particular bug is now fixed).
2. Stream chunks with JSON string content, with a final chunk containing the parsed Pydantic object. Obtain this parsed object during the stream from the content…

(2) in my opinion is not an obvious slam dunk as there are some trade-offs, so I will keep things as-is for now, but please let me know if there are other options or you have additional thoughts.
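For illustration, here is a sketch of how option (1) looks from the consumer side, assuming the post-#29649 behavior in which JSON string content streams in ordinary chunks and a single final chunk carries the parsed object. It reuses the hypothetical model and oai_response_format from the reproduction sketch above.

```python
# Sketch of consuming option (1), assuming post-#29649 behavior: JSON string
# content streams in ordinary chunks and one final chunk carries the parsed
# object. "model" and "oai_response_format" are the hypothetical values from
# the reproduction sketch earlier in this discussion.
aggregate = None
for chunk in model.bind(response_format=oai_response_format).stream(
    "What is the weather in New York?"
):
    # AIMessageChunk addition concatenates string fields such as content and
    # tool-call arguments, which is exactly why emitting the final
    # completion's tool calls on top of the already-streamed deltas doubled
    # every id, name, and argument string in the merged message.
    aggregate = chunk if aggregate is None else aggregate + chunk

# Assumption (see the sketch above): the final chunk populates "parsed" in
# additional_kwargs with the fully parsed object.
parsed = aggregate.additional_kwargs.get("parsed")
print(parsed)
```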