Conversation

@luispater

No description provided.

hkfires and others added 19 commits December 10, 2025 22:19
fix(claude): prevent final events when no content streamed
NormalizeThinkingModel now checks ModelSupportsThinking before removing
"-thinking" or "-thinking-<ver>", avoiding accidental parsing of model
names where the suffix is part of the official id (e.g., kimi-k2-thinking,
qwen3-235b-a22b-thinking-2507).

The registry adds ThinkingSupport metadata for several models and
propagates it via ModelInfo (e.g., kimi-k2-thinking, deepseek-r1,
qwen3-235b-a22b-thinking-2507, minimax-m2), enabling accurate detection
of thinking-capable models and correcting base model inference.
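
A minimal sketch of how such registry metadata could look, inferred from the ThinkingSupport diff excerpt reviewed below; every name other than ThinkingSupport, Levels, and Thinking is an assumption for illustration:

package registry

// ThinkingSupport mirrors the struct shown in the review diff below;
// Levels holds the discrete reasoning effort levels a model accepts.
type ThinkingSupport struct {
    Levels []string // e.g. "low", "medium", "high"
}

// ModelInfo is a hypothetical registry entry; only the Thinking field
// is confirmed by this PR's diff excerpts.
type ModelInfo struct {
    ID       string
    Thinking *ThinkingSupport // nil: model is not thinking-capable
}

// ModelSupportsThinking reports whether an id is registered as
// thinking-capable, so official ids such as "kimi-k2-thinking" are not
// mistaken for a base model plus a "-thinking" suffix.
func ModelSupportsThinking(models map[string]ModelInfo, id string) bool {
    info, ok := models[id]
    return ok && info.Thinking != nil
}
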
Align thinking suffix handling on a single bracket-style marker.

NormalizeThinkingModel strips a terminal `[value]` segment from
model identifiers and turns it into either a thinking budget (for
numeric values) or a reasoning effort hint (for strings). Emission
of `ThinkingIncludeThoughtsMetadataKey` is removed.

Executor helpers and the example config are updated so their
comments reference the new `[value]` suffix format instead of the
legacy dash variants.

BREAKING CHANGE: dash-based thinking suffixes (`-thinking`,
`-thinking-N`, `-reasoning`, `-nothinking`) are no longer parsed
for thinking metadata; only `[value]` annotations are recognized.
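
As a rough sketch of the bracket parsing described above (the real NormalizeThinkingModel signature is not shown in this PR, so the function name and return shape here are assumptions):

package util

import (
    "strconv"
    "strings"
)

// splitThinkingSuffix is an illustrative stand-in for the new parsing:
// a terminal [value] segment becomes either a numeric thinking budget
// or a string reasoning effort hint.
func splitThinkingSuffix(model string) (base string, budget int, effort string) {
    open := strings.LastIndex(model, "[")
    if open < 0 || !strings.HasSuffix(model, "]") {
        return model, 0, "" // no [value] annotation present
    }
    value := model[open+1 : len(model)-1]
    base = model[:open]
    if n, err := strconv.Atoi(value); err == nil {
        return base, n, "" // numeric value: thinking token budget
    }
    return base, 0, value // string value: reasoning effort, e.g. "high"
}

Under this scheme, "claude-sonnet-4[16384]" would yield a 16384-token budget, "gpt-5-codex[high]" would yield a "high" effort hint, and an official id like "kimi-k2-thinking" would pass through untouched.
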
fix(amp): set status on claude stream errors
fix: increase buffer size for stream scanners to 50MB across multiple executors
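
For the amp/claude stream-error fix, the pattern is roughly the following; the handler name and SSE payload are illustrative, not the project's actual code:

package handlers

import (
    "fmt"
    "net/http"
)

// writeStreamError sets the HTTP status before the first body write and
// flushes the error event immediately, so clients see an accurate status
// instead of a silent 200 with a buffered error.
func writeStreamError(w http.ResponseWriter, status int, msg string) {
    w.Header().Set("Content-Type", "text/event-stream")
    w.WriteHeader(status)
    fmt.Fprintf(w, "event: error\ndata: {\"message\": %q}\n\n", msg)
    if flusher, ok := w.(http.Flusher); ok {
        flusher.Flush()
    }
}
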
@gemini-code-assist

Summary of Changes

Hello @luispater, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces significant improvements to how 'thinking' capabilities are managed and applied across various AI models, including Claude, OpenAI, and iFlow. It standardizes the configuration of these features through a new model naming convention and introduces support for discrete reasoning effort levels. Alongside these functional enhancements, the PR boosts the robustness of streaming operations by increasing internal buffer sizes and refines the handling of streaming events and errors, ensuring a more consistent and reliable user experience.

Highlights

  • Unified Thinking Configuration: Implemented a standardized approach for configuring 'thinking' capabilities across various AI models (Claude, OpenAI, iFlow) using a new [value] suffix pattern for model names and introducing discrete reasoning effort levels.
  • Increased Stream Buffer Sizes: Enhanced streaming performance and stability by increasing the buffer size for bufio.Scanner in multiple executors and translators from 20MB to 50MB.
  • Refined Claude Thinking Logic: Centralized Claude-specific thinking configuration logic into a new utility file (claude_thinking.go) and integrated it with the unified thinking resolution, removing older suffix-based logic.
  • Improved Streaming Event Handling: Modified Claude response translators to only emit final streaming events (message_stop, content_block_stop, message_delta) if actual content (text, thinking, or tool use) has been generated, preventing unnecessary empty events; a sketch follows this list.
  • Enhanced Error Reporting: Added HTTP status code setting and explicit flusher calls for streaming error events in Claude code handlers to provide more accurate and timely error responses.
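
A minimal sketch of that empty-stream guard, with assumed names (the translators' real state struct is not shown in this excerpt):

package translator

// streamState tracks whether any content block was produced; the field
// and method names are assumptions for illustration only.
type streamState struct {
    hasContent bool // set once text, thinking, or tool_use is streamed
}

func (s *streamState) markContent() { s.hasContent = true }

// finalEvents returns the closing SSE event names, or nil when nothing
// was streamed, so an empty response never emits message_stop.
func (s *streamState) finalEvents() []string {
    if !s.hasContent {
        return nil
    }
    return []string{"content_block_stop", "message_delta", "message_stop"}
}
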
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature | Command | Description
Code Review | /gemini review | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state.
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help | /gemini help | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.


@gemini-code-assist bot left a comment


Code Review

This pull request introduces level-based "thinking" support for various models, refactors how thinking configuration is specified and handled, increases stream buffer sizes, and fixes some bugs related to streaming empty responses and error handling. The changes are well-structured, particularly the centralization of logic into the util package. My feedback focuses on improving maintainability by reducing code duplication for ThinkingSupport structs and buffer sizes.

Comment on lines 496 to +499
  ContextLength:       400000,
  MaxCompletionTokens: 128000,
  SupportedParameters: []string{"tools"},
+ Thinking:            &ThinkingSupport{Levels: []string{"low", "medium", "high"}},


medium

There are multiple instances of &ThinkingSupport{Levels: []string{"low", "medium", "high"}} and other similar ThinkingSupport structs being created. To improve maintainability and reduce duplication, consider defining these common configurations as package-level constants or variables.

For example:

var thinkingLevelsLowMediumHigh = &ThinkingSupport{Levels: []string{"low", "medium", "high"}}

// ... in GetOpenAIModels()
{
    ID:                  "gpt-5-codex",
    // ...
    Thinking:            thinkingLevelsLowMediumHigh,
},

This would apply to several other model definitions in this file as well.

  if from == to {
      scanner := bufio.NewScanner(decodedBody)
-     scanner.Buffer(nil, 20_971_520)
+     scanner.Buffer(nil, 52_428_800) // 50MB


medium

The buffer size 52_428_800 is hardcoded here and in several other executors within this package. A constant streamScannerBuffer is defined in antigravity_executor.go but is not used. To avoid magic numbers and improve maintainability, please use this constant.

This comment also applies to codex_executor.go, gemini_cli_executor.go, gemini_executor.go, gemini_vertex_executor.go, iflow_executor.go, and openai_compat_executor.go.

Suggested change
- scanner.Buffer(nil, 52_428_800) // 50MB
+ scanner.Buffer(nil, streamScannerBuffer)

Comment on lines +334 to +335
+ buffer := make([]byte, 52_428_800) // 50MB
+ scanner.Buffer(buffer, 52_428_800)


medium

The buffer size 52_428_800 is hardcoded. This value is also used in other packages (e.g., executor) and other files in this package (e.g., claude_openai-responses_response.go). To ensure consistency and improve maintainability, it would be beneficial to define this as a shared constant in a utility package (e.g., in internal/util).

For now, defining a local constant would be an improvement:

const streamScannerBuffer = 52_428_800 // 50MB
// ...
buffer := make([]byte, streamScannerBuffer)
scanner.Buffer(buffer, streamScannerBuffer)
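
Going one step further, the shared constant could live in internal/util as the comment suggests; the file path and exported name below are hypothetical:

// internal/util/stream.go (hypothetical path and name)
package util

// StreamScannerBufferSize caps bufio.Scanner token sizes at 50MB for
// every streaming executor and translator.
const StreamScannerBufferSize = 52_428_800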

@luispater luispater merged commit 242b4d5 into router-for-me:main Dec 11, 2025
1 check failed
@luispater luispater deleted the plus branch December 11, 2025 13:35