Skip to content

Conversation

@Ravens2121
Copy link

PR Title / 拉取请求标题

feat(kiro): Add Thinking Mode support & enhance reliability with multi-quota failover
feat(kiro): 支持思考模型 (Thinking Mode) 并通过多配额故障转移增强稳定性


PR Description / 拉取请求描述

📝 Summary / 摘要

This PR introduces significant upgrades to the Kiro (AWS CodeWhisperer/Amazon Q) module. It adds native support for Thinking/Reasoning models (similar to OpenAI o1/Claude 3.7), implements a robust Multi-Endpoint Failover system to handle rate limits (429), and optimizes configuration flexibility.

本次 PR 对 Kiro (AWS CodeWhisperer/Amazon Q) 模块进行了重大升级。它增加了对 思考/推理模型 (Thinking/Reasoning models) 的原生支持(类似 OpenAI o1/Claude 3.7),实现了一套健壮的 多端点故障转移 (Multi-Endpoint Failover) 系统以应对速率限制 (429),并优化了配置灵活性。

✨ Key Changes / 主要变更

1. 🧠 Thinking Mode Support / 思考模式支持

  • OpenAI Compatibility: Automatically maps OpenAI's reasoning_effort parameter (low/medium/high) to Claude's budget_tokens (4k/16k/32k).
    • OpenAI 兼容性:自动将 OpenAI 的 reasoning_effort 参数(low/medium/high)映射为 Claude 的 budget_tokens(4k/16k/32k)。
  • Stream Parsing: Implemented advanced stream parsing logic to detect and extract content within <thinking>...</thinking> tags, even across chunk boundaries.
    • 流式解析:实现了高级流式解析逻辑,能够检测并提取 <thinking>...</thinking> 标签内的内容,即使标签跨越了数据块边界。
  • Protocol Translation: Converts Kiro's internal thinking content into OpenAI-compatible reasoning_content fields (for non-stream) or thinking_delta events (for stream).
    • 协议转换:将 Kiro 内部的思考内容转换为兼容 OpenAI 的 reasoning_content 字段(非流式)或 thinking_delta 事件(流式)。

2. 🛡️ Robustness & Failover / 稳健性与故障转移

  • Dual Quota System: Explicitly defined kiroEndpointConfig to distinguish between IDE (CodeWhisperer) and CLI (Amazon Q) quotas.
    • 双配额系统:显式定义了 kiroEndpointConfig 结构,明确区分 IDE (CodeWhisperer)CLI (Amazon Q) 的配额来源。
  • Auto Failover: Implemented automatic failover logic. If one endpoint returns 429 Too Many Requests, the request seamlessly retries on the next available endpoint/quota.
    • 自动故障转移:实现了自动故障转移逻辑。如果一个端点返回 429 Too Many Requests,请求将无缝在下一个可用端点/配额上重试。
  • Strict Protocol Compliance: Enforced strict matching of Origin and X-Amz-Target headers for each endpoint to prevent 403 Forbidden errors due to protocol mismatches.
    • 严格协议合规:强制每个端点严格匹配 OriginX-Amz-Target 头信息,防止因协议不匹配导致的 403 Forbidden 错误。

3. ⚙️ Configuration & Models / 配置与模型

  • New Config Options: Added KiroPreferredEndpoint (global) and PreferredEndpoint (per-key) settings to allow users to prioritize specific quotas (e.g., "ide" or "cli").
    • 新配置项:添加了 KiroPreferredEndpoint(全局)和 PreferredEndpoint(单 Key)设置,允许用户优先选择特定的配额(如 "ide" 或 "cli")。
  • Model Registry: Normalized model IDs (replaced dots with hyphens) and added -agentic variants optimized for large code generation tasks.
    • 模型注册表:规范化了模型 ID(将点号替换为连字符),并添加了针对大型代码生成任务优化的 -agentic 变体。

4. 🔧 Fixes / 修复

  • AMP Proxy: Downgraded client-side context cancellation logs from Error to Debug to reduce log noise.
    • AMP 代理:将客户端上下文取消的日志级别从 Error 降级为 Debug,减少日志噪音。

⚠️ Impact / 影响

  • Authentication: No changes to the login/OAuth process. Existing tokens work as is.
  • 认证:登录/OAuth 流程 无变更。现有 Token 可直接使用。
  • Compatibility: Fully backward compatible. The new failover logic is transparent to the user.
  • 兼容性:完全向后兼容。新的故障转移逻辑对用户是透明的。

@gemini-code-assist
Copy link

Summary of Changes

Hello @Ravens2121, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

此拉取请求对Kiro(AWS CodeWhisperer/Amazon Q)模块进行了重大升级,旨在增强其智能性和可靠性。它引入了对思考/推理模型的原生支持,允许模型在生成响应时展示其内部思考过程。同时,通过实现多端点故障转移系统,显著提升了服务面对速率限制时的稳定性。此外,还优化了配置灵活性,并对模型注册表进行了规范化,以更好地支持不同的使用场景和模型变体。这些改进在保持完全向后兼容性的同时,为用户提供了更强大、更稳定的AI辅助开发体验。

Highlights

  • 思考模式支持: 增加了对思考/推理模型的原生支持,包括将OpenAI的reasoning_effort参数(low/medium/high)自动映射到Claude的budget_tokens(4k/16k/32k),实现了高级流式解析逻辑以检测和提取<thinking>...</thinking>标签内的内容(即使跨越数据块边界),并将Kiro内部的思考内容转换为兼容OpenAI的reasoning_content字段(非流式)或thinking_delta事件(流式)。
  • 稳健性与故障转移: 实现了健壮的多端点故障转移系统,以应对速率限制(429)。明确定义了kiroEndpointConfig结构来区分IDE(CodeWhisperer)和CLI(Amazon Q)配额,并在一个端点返回429 Too Many Requests时,请求将无缝在下一个可用端点/配额上重试。同时,强制每个端点严格匹配OriginX-Amz-Target头信息,以防止因协议不匹配导致的403 Forbidden错误。
  • 配置与模型优化: 新增了KiroPreferredEndpoint(全局)和PreferredEndpoint(单Key)配置项,允许用户优先选择特定的配额(如“ide”或“cli”)。规范化了模型ID(将点号替换为连字符),并添加了针对大型代码生成任务优化的-agentic变体。
  • AMP代理日志修复: 将AMP代理中客户端上下文取消的日志级别从Error降级为Debug,以减少日志噪音。
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

这次的 PR 是一次意义重大且执行出色的升级。它为 Kiro 端点引入了健壮的故障转移逻辑,增加了对“思考模式”功能的支持,并优化了配置的灵活性。端点专属配置和自动重试/故障转移逻辑的实现尤其令人印象深刻,极大地增强了 Kiro 模块的稳定性。思考模式的协议转换也实现得很好。

我有几个建议,主要集中在通过减少冗余和清理一些数据结构来提高代码的可维护性。具体细节在各条评论中。总的来说,这是一次高质量的贡献。

// - Amazon Q endpoint (CLI origin) uses Amazon Q Developer quota
// - CodeWhisperer endpoint (AI_EDITOR origin) uses Kiro IDE quota
// Also supports multi-endpoint fallback similar to Antigravity implementation.
func (e *KiroExecutor) executeWithRetry(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options, accessToken, profileArn string, kiroPayload, body []byte, from, to sdktranslator.Format, reporter *usageReporter, currentOrigin, kiroModelID string, isAgentic, isChatOnly bool) (cliproxyexecutor.Response, error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

executeWithRetry(以及 executeStreamWithRetry)的函数签名包含了 kiroPayloadcurrentOrigin 参数。然而,在函数开头的 for endpointIdx... 循环中,这些参数在被使用之前立即被覆盖了。

kiroPayload 在第 306 行被重新构建,currentOrigin 在第 302 行被重新赋值。这使得从调用方(ExecuteExecuteStream)传入的参数变得多余。

为了提高代码清晰度并简化代码,建议进行重构。你可以从 executeWithRetryexecuteStreamWithRetry 的函数签名中移除 kiroPayloadcurrentOrigin 参数。这些变量可以在这些函数内部声明为局部变量,并且可以移除 ExecuteExecuteStream 中现在多余的、初始的 payload 创建逻辑。

Comment on lines +913 to +922
"claude-opus-4-5": "claude-opus-4.5",
"claude-opus-4.5": "claude-opus-4.5",
"claude-haiku-4-5": "claude-haiku-4.5",
"claude-haiku-4.5": "claude-haiku-4.5",
"claude-sonnet-4-5": "claude-sonnet-4.5",
"claude-sonnet-4-5-20250929": "claude-sonnet-4.5",
"claude-sonnet-4.5": "claude-sonnet-4.5",
"claude-sonnet-4": "claude-sonnet-4",
"claude-sonnet-4-20250514": "claude-sonnet-4",
"auto": "auto",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

mapModelToKiro 函数的 modelMap 中,原生模型格式存在几个重复的键。例如,claude-opus-4.5 被列出了两次。

虽然这不会导致运行时错误(map 中一个键的最后一个条目会生效),但这使得代码更难阅读和维护。最好删除这些重复的条目以保持代码整洁。

Suggested change
"claude-opus-4-5": "claude-opus-4.5",
"claude-opus-4.5": "claude-opus-4.5",
"claude-haiku-4-5": "claude-haiku-4.5",
"claude-haiku-4.5": "claude-haiku-4.5",
"claude-sonnet-4-5": "claude-sonnet-4.5",
"claude-sonnet-4-5-20250929": "claude-sonnet-4.5",
"claude-sonnet-4.5": "claude-sonnet-4.5",
"claude-sonnet-4": "claude-sonnet-4",
"claude-sonnet-4-20250514": "claude-sonnet-4",
"auto": "auto",
"claude-opus-4-5": "claude-opus-4.5",
"claude-haiku-4-5": "claude-haiku-4.5",
"claude-sonnet-4-5": "claude-sonnet-4.5",
"claude-sonnet-4-5-20250929": "claude-sonnet-4.5",
"claude-sonnet-4": "claude-sonnet-4",
"claude-sonnet-4-20250514": "claude-sonnet-4",
"auto": "auto",

@luispater luispater merged commit 8dc690a into router-for-me:main Dec 12, 2025
1 check failed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants