
Conversation


@thesynapses commented Nov 23, 2025

Fixes #3665

Streaming responses from LiteLLM models (Claude, GPT, etc.) were not setting finish_reason on aggregated LlmResponse objects, so agent runners could not reliably recognize completion states.

This fix mirrors the finish_reason mapping logic from the non-streaming path (lines 776-784) and applies it to both streaming code paths:

  • Tool call responses (lines 1340-1368)
  • Text-only responses (lines 1369-1390)

Without this fix, agents using Claude or GPT via LiteLLM would encounter stop conditions that couldn't be properly handled, leading to incomplete responses or unexpected agent behavior.
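
A minimal sketch of the change, in Python, is shown below. The variable names and the mapping entries are assumptions that approximate the existing _FINISH_REASON_MAPPING dictionary and aggregation code in lite_llm.py; they are illustrative, not the exact source.

# Sketch only: how the fix maps LiteLLM's string finish reasons onto ADK's
# enum before setting them on the aggregated streaming response.
from google.genai import types

# Stand-in for the existing _FINISH_REASON_MAPPING in lite_llm.py;
# the real entries may differ.
_FINISH_REASON_MAPPING = {
    "stop": types.FinishReason.STOP,
    "tool_calls": types.FinishReason.STOP,
    "length": types.FinishReason.MAX_TOKENS,
    "content_filter": types.FinishReason.SAFETY,
}

# String captured from the final streamed chunk, e.g. "stop" or "tool_calls".
finish_reason = "stop"

mapped_finish_reason = None
if finish_reason:
    mapped_finish_reason = _FINISH_REASON_MAPPING.get(
        finish_reason.lower(), types.FinishReason.FINISH_REASON_UNSPECIFIED
    )

# Both streaming branches (tool-call and text-only) now set this on the
# aggregated LlmResponse, where it was previously left as None:
# aggregated_response.finish_reason = mapped_finish_reason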

Tested with Claude Sonnet 4.5 and GPT-5 via Azure OpenAI in a production multi-agent system with MCP tools.


Link to Issue or Description of Change

1. Link to an existing issue: #3665

Testing Plan

Problem:
When using LiteLLM models in streaming mode, the finish_reason field was never set on aggregated LlmResponse objects. This prevented ADK agent runners from reliably detecting when a response had completed, leading to incomplete responses, unrecognized stop conditions, and unpredictable behavior with Claude/GPT models.

Solution:
Added finish_reason mapping in both streaming code paths (tool calls and text-only), mirroring the existing non-streaming implementation at lines 776-784. Maps LiteLLM's string finish reasons ("stop", "tool_calls", etc.) to ADK's types.FinishReason enum values using the existing _FINISH_REASON_MAPPING dictionary.

Unit Tests:

  • I have added or updated unit tests for my change (a hedged sketch of the general shape of such a test appears after this list).
  • All unit tests pass locally.
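
The sketch below shows the general shape of such a test. The fixture names and the mocked streaming setup are assumptions for illustration, not the exact fixtures used in the ADK test suite.

# Illustrative only: `lite_llm_instance` and `llm_request` are assumed
# fixtures; the instance is wired to a mocked LiteLLM client whose final
# streamed chunk carries finish_reason="stop".
import pytest
from google.genai import types


@pytest.mark.asyncio
async def test_streaming_response_sets_finish_reason(
    lite_llm_instance, llm_request
):
    responses = [
        resp
        async for resp in lite_llm_instance.generate_content_async(
            llm_request, stream=True
        )
    ]
    # Before the fix the aggregated response left finish_reason as None;
    # after the fix it carries the mapped enum value.
    assert responses[-1].finish_reason == types.FinishReason.STOP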

Manual End-to-End (E2E) Tests:

Setup:

  • Multi-agent system with ADK 1.19.0 + LiteLLM wrapper
  • Claude Sonnet 4.5 via Vertex AI (vertex_ai/claude-sonnet-4-5@20250929)
  • GPT-5 via Azure OpenAI (azure/gpt-5-openai-latest)
  • Streaming SSE mode with progressive chunk delivery
  • MCP tools connected via Gluon Link: Google Drive agent, HubSpot CRM agent

Test Cases:

  1. Google Drive: File listing with formatted markdown table output
  2. HubSpot CRM: Company queries with structured data
  3. Multi-turn conversations with tool calls

Before Fix:

  • finish_reason field was None on streaming responses
  • Agents showed incomplete/truncated responses
  • Claude: 291 tokens delivered, response cut off mid-table
  • GPT-5: 682 tokens but inconsistent completion detection

After Fix:

  • finish_reason correctly set to types.FinishReason.STOP
  • Complete responses delivered to users
  • Claude: Full markdown tables rendered properly (759 chars)
  • GPT-5: Consistent completion with proper finish_reason
  • Both models reliably signal completion states

Log Evidence:

# After fix - GPT-5 example:
INFO | GPT-5 streaming completed. Events processed: 15
INFO | Usage: prompt=123 tokens, candidates=682 tokens, finish_reason=STOP

Checklist

  • I have read the CONTRIBUTING.md document.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.
  • I have manually tested my changes end-to-end.
  • Any dependent changes have been merged and published in downstream modules.

Additional context

This fix is critical for production systems using any LiteLLM-supported models (Claude, GPT, Mistral, etc.) in streaming mode. The bug affects all streaming scenarios where the ADK agent runner needs to detect proper completion. The fix ensures consistent behavior between streaming and non-streaming modes, making LiteLLM a viable production option for multi-agent systems.

Related to issue #3676 (double serialization) - both bugs prevented proper Claude/GPT operation with ADK.

@gemini-code-assist (Contributor) commented

Summary of Changes

Hello @thesynapses, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a critical bug where LiteLLM streaming responses were failing to populate the finish_reason field in aggregated LlmResponse objects. This omission caused agent runners to misinterpret completion states, leading to functional issues in multi-agent systems. The solution integrates the established finish_reason mapping logic into the streaming pathways so that agents can reliably detect when a model's response has concluded, improving the stability and correctness of agent interactions with streaming LLMs.

Highlights

  • Fix for LiteLLM Streaming Responses: Addressed an issue where streaming responses from LiteLLM models (e.g., Claude, GPT) were not correctly setting the finish_reason on aggregated LlmResponse objects.
  • Agent Runner Completion Recognition: The absence of finish_reason prevented agent runners from properly recognizing completion states, leading to incomplete responses or unexpected agent behavior.
  • Consistent Finish Reason Mapping: Implemented the finish_reason mapping logic, mirroring the existing non-streaming path, to ensure consistent behavior across both streaming and non-streaming modes.
  • Scope of Fix: The fix has been applied to both streaming code paths: for tool call responses and for text-only responses.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist bot left a comment


Code Review

This pull request correctly addresses a bug where finish_reason was not being mapped for streaming responses from LiteLLM, which could lead to incorrect agent behavior. The fix applies the existing mapping logic from the non-streaming path to both tool-calling and text-only streaming responses.

My review includes one suggestion to refactor the duplicated code into a helper function. This will improve the code's maintainability by adhering to the DRY principle. Overall, this is a good fix that improves the robustness of the LiteLLM integration.
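
For reference, the suggested extraction might look roughly like the sketch below; the helper name and call sites are hypothetical illustrations, not the reviewer's exact proposal or the PR's final code.

# Hypothetical shape of the DRY refactor: one module-level helper shared by
# the non-streaming path and both streaming branches in lite_llm.py.
from typing import Optional

from google.genai import types

# Stand-in for the existing mapping dictionary; real entries may differ.
_FINISH_REASON_MAPPING = {"stop": types.FinishReason.STOP}


def _map_finish_reason(raw: Optional[str]) -> Optional[types.FinishReason]:
    """Translates a LiteLLM finish_reason string into an ADK enum value."""
    if not raw:
        return None
    return _FINISH_REASON_MAPPING.get(
        raw.lower(), types.FinishReason.FINISH_REASON_UNSPECIFIED
    )


# At each aggregation site (illustrative):
# llm_response.finish_reason = _map_finish_reason(finish_reason)

Centralizing the lookup this way would keep the three call sites from drifting apart if the mapping ever changes.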

@adk-bot (Collaborator) commented Nov 23, 2025

Response from ADK Triaging Agent

Hello @thesynapses, thank you for creating this PR!

Could you please fill out the Testing Plan section in your PR description? This information will help reviewers to review your PR more efficiently. Thanks!

@adk-bot added the models [Component] Issues related to model support label on Nov 23, 2025
@ryanaiagent self-assigned this on Nov 25, 2025
