v6.6.1 #19
Conversation
fix(claude): prevent final events when no content streamed
NormalizeThinkingModel now checks ModelSupportsThinking before removing "-thinking" or "-thinking-<ver>", avoiding accidental parsing of model names where the suffix is part of the official id (e.g., kimi-k2-thinking, qwen3-235b-a22b-thinking-2507). The registry adds ThinkingSupport metadata for several models and propagates it via ModelInfo (e.g., kimi-k2-thinking, deepseek-r1, qwen3-235b-a22b-thinking-2507, minimax-m2), enabling accurate detection of thinking-capable models and correcting base model inference.
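A rough sketch of that gate, in the spirit of the commit message (the function and helper names follow the text above, but the registry lookup below is a stand-in, not the PR's actual code):

package registry

import "strings"

// normalizeThinkingModel only strips a "-thinking" suffix when the full
// name is not itself an official thinking-capable model id.
// (-thinking-<ver> variants would be handled analogously.)
func normalizeThinkingModel(model string) (base string, thinking bool) {
	// If the suffix is part of the official id (e.g. "kimi-k2-thinking"),
	// the registry reports thinking support and the name is kept intact.
	if modelSupportsThinking(model) {
		return model, true
	}
	if s, ok := strings.CutSuffix(model, "-thinking"); ok {
		return s, true
	}
	return model, false
}

// modelSupportsThinking is a stand-in for the ThinkingSupport metadata
// lookup the commit describes adding to the registry.
func modelSupportsThinking(model string) bool {
	known := map[string]bool{
		"kimi-k2-thinking":              true,
		"deepseek-r1":                   true,
		"qwen3-235b-a22b-thinking-2507": true,
		"minimax-m2":                    true,
	}
	return known[model]
}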
Align thinking suffix handling on a single bracket-style marker. NormalizeThinkingModel strips a terminal `[value]` segment from model identifiers and turns it into either a thinking budget (for numeric values) or a reasoning effort hint (for strings). Emission of `ThinkingIncludeThoughtsMetadataKey` is removed. Executor helpers and the example config are updated so their comments reference the new `[value]` suffix format instead of the legacy dash variants. BREAKING CHANGE: dash-based thinking suffixes (`-thinking`, `-thinking-N`, `-reasoning`, `-nothinking`) are no longer parsed for thinking metadata; only `[value]` annotations are recognized.
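A minimal sketch of the new suffix handling, assuming a hypothetical helper name (only the `[value]` format and its numeric-vs-string semantics come from the commit message):

package thinking

import (
	"strconv"
	"strings"
)

// parseThinkingSuffix strips a terminal "[value]" segment and maps it to a
// numeric thinking budget or a string reasoning-effort hint. Legacy dash
// suffixes (-thinking, -reasoning, -nothinking) are deliberately not parsed.
func parseThinkingSuffix(model string) (base string, budget int, effort string) {
	open := strings.LastIndex(model, "[")
	if open == -1 || !strings.HasSuffix(model, "]") {
		return model, 0, "" // no annotation present
	}
	base, value := model[:open], model[open+1:len(model)-1]
	if n, err := strconv.Atoi(value); err == nil {
		return base, n, "" // numeric value -> thinking budget
	}
	return base, 0, value // string value -> reasoning effort, e.g. "high"
}

Under this sketch, a hypothetical "gpt-5-codex[high]" would yield the effort hint "high", while "some-model[8192]" would yield a token budget of 8192.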
fix(amp): set status on claude stream errors
fix: increase buffer size for stream scanners to 50MB across multiple executors
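For context, Go's bufio.Scanner rejects tokens longer than bufio.MaxScanTokenSize (64 KiB) by default and returns bufio.ErrTooLong, which a single long SSE line can trigger; the fix raises that ceiling, roughly as in this sketch:

package main

import (
	"bufio"
	"fmt"
	"strings"
)

const streamScannerBuffer = 52_428_800 // 50MB, the size used in this PR

func main() {
	body := strings.NewReader("data: {\"delta\":\"...\"}\n")
	scanner := bufio.NewScanner(body)
	// A nil start buffer lets the scanner grow on demand, up to the
	// 50MB cap instead of the 64KiB default.
	scanner.Buffer(nil, streamScannerBuffer)
	for scanner.Scan() {
		fmt.Println(scanner.Text()) // one SSE line per token
	}
}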
Summary of Changes (Gemini Code Assist): This pull request introduces significant improvements to how 'thinking' capabilities are managed and applied across various AI models, including Claude, OpenAI, and iFlow. It standardizes the configuration of these features through a new model naming convention and introduces support for discrete reasoning effort levels. Alongside these functional enhancements, the PR boosts the robustness of streaming operations by increasing internal buffer sizes and refines the handling of streaming events and errors, ensuring a more consistent and reliable user experience.
Code Review
This pull request introduces level-based "thinking" support for various models, refactors how thinking configuration is specified and handled, increases stream buffer sizes, and fixes some bugs related to streaming empty responses and error handling. The changes are well-structured, particularly the centralization of logic into the util package. My feedback focuses on improving maintainability by reducing code duplication for ThinkingSupport structs and buffer sizes.
ContextLength:       400000,
MaxCompletionTokens: 128000,
SupportedParameters: []string{"tools"},
Thinking:            &ThinkingSupport{Levels: []string{"low", "medium", "high"}},
There are multiple instances of &ThinkingSupport{Levels: []string{"low", "medium", "high"}} and other similar ThinkingSupport structs being created. To improve maintainability and reduce duplication, consider defining these common configurations as package-level constants or variables.
For example:

var thinkingLevelsLowMediumHigh = &ThinkingSupport{Levels: []string{"low", "medium", "high"}}

// ... in GetOpenAIModels()
{
	ID: "gpt-5-codex",
	// ...
	Thinking: thinkingLevelsLowMediumHigh,
},

This would apply to several other model definitions in this file as well.
  if from == to {
  	scanner := bufio.NewScanner(decodedBody)
- 	scanner.Buffer(nil, 20_971_520)
+ 	scanner.Buffer(nil, 52_428_800) // 50MB
The buffer size 52_428_800 is hardcoded here and in several other executors within this package. A constant streamScannerBuffer is defined in antigravity_executor.go but is not used. To avoid magic numbers and improve maintainability, please use this constant.
This comment also applies to codex_executor.go, gemini_cli_executor.go, gemini_executor.go, gemini_vertex_executor.go, iflow_executor.go, and openai_compat_executor.go.
Suggested change:
- scanner.Buffer(nil, 52_428_800) // 50MB
+ scanner.Buffer(nil, streamScannerBuffer)
buffer := make([]byte, 52_428_800) // 50MB
scanner.Buffer(buffer, 52_428_800)
The buffer size 52_428_800 is hardcoded. This value is also used in other packages (e.g., executor) and other files in this package (e.g., claude_openai-responses_response.go). To ensure consistency and improve maintainability, it would be beneficial to define this as a shared constant in a utility package (e.g., in internal/util).
For now, defining a local constant would be an improvement:
const streamScannerBuffer = 52_428_800 // 50MB
// ...
buffer := make([]byte, streamScannerBuffer)
scanner.Buffer(buffer, streamScannerBuffer)