feat: enhance thinking mode support for Kiro translator #32

Ravens2121 · 2025-12-15T21:09:46Z

PR: Enhanced Thinking Mode Support - Header Detection & reasoning_content Output

Description

This PR enhances the Kiro translator's support for Claude thinking mode, including:

Header Detection: Detect the interleaved-thinking-2025-05-14 identifier in the Anthropic-Beta header and enable thinking mode when it is present
reasoning_content Output: Convert thinking content to the OpenAI-compatible reasoning_content field instead of discarding it
System Prompt Optimization: Inject thinking instructions at the beginning of the system prompt with a fixed max_thinking_length of 200000
Duplicate Injection Prevention: Check whether thinking tags already exist in the request body to avoid injecting them twice

Changes

1. `internal/runtime/executor/kiro_executor.go`

Changed the buildKiroPayloadForFormat() signature to accept a headers parameter
Passed headers through to the translator so that it can detect thinking mode
Added an accumulatedThinkingContent variable to accumulate thinking content

2. `internal/translator/kiro/claude/kiro_claude_request.go`

New functions:

IsThinkingEnabledFromHeader(headers http.Header) bool - Detect thinking mode from Anthropic-Beta header
IsThinkingEnabledWithHeaders(req *ClaudeRequest, headers http.Header) bool - Combined detection function, integrating request body and header
hasThinkingTagInBody(req *ClaudeRequest) bool - Detect if thinking tags already exist in request body to prevent duplicate injection

Modifications:

Moved the thinking prompt to the beginning of the system prompt
Switched to a fixed max_thinking_length of 200000

3. `internal/translator/kiro/claude/kiro_claude_stream.go`

New functions:

BuildClaudeThinkingBlockStopEvent() - Build thinking block stop event

4. `internal/translator/kiro/openai/kiro_openai.go`

Modifications:

Convert thinking block content to reasoning_content field instead of skipping
Use BuildOpenAIResponseWithReasoning() to build response with reasoning content

5. `internal/translator/kiro/openai/kiro_openai_request.go`

New functions:

checkThinkingModeFromOpenAIWithHeaders(req *OpenAIRequest, headers http.Header) bool - Support detecting thinking mode from header

Modifications:

Simplified thinking mode detection logic

6. `internal/translator/kiro/openai/kiro_openai_response.go`

New functions:

BuildOpenAIResponseWithReasoning(content, reasoningContent, model string) *OpenAIResponse - Build OpenAI response with reasoning_content field

7. `internal/translator/kiro/claude/kiro_claude_response.go`

New functions:

generateThinkingSignature() - Generate SHA256 signature for thinking content

Modifications:

ExtractThinkingFromContent() - Add signature field to all thinking blocks

Bug Fixes

Cherry Studio Non-Streaming Mode ZodError Fix

Problem: Cherry Studio reported ZodError validation error in non-streaming mode
Root Cause: Thinking blocks were missing the required signature field
Solution: Generate and add signature field to all thinking blocks using SHA256 hash
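
The fix above can be sketched as follows. This is a minimal illustration, not the code from the PR: the names thinkingBlock, thinkingSignature, and newThinkingBlock are hypothetical, and hex-encoding the digest is an assumption, since the description only says the signature is derived from a SHA256 hash.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// thinkingBlock mirrors the shape of a Claude thinking content block.
// The struct itself is a hypothetical stand-in for the translator's type.
type thinkingBlock struct {
	Type      string `json:"type"`
	Thinking  string `json:"thinking"`
	Signature string `json:"signature"`
}

// thinkingSignature returns the hex-encoded SHA-256 digest of the thinking text.
// Hex encoding is an assumption; the PR only states that SHA256 is used.
func thinkingSignature(thinking string) string {
	sum := sha256.Sum256([]byte(thinking))
	return hex.EncodeToString(sum[:])
}

// newThinkingBlock always fills in the signature, so clients that validate
// the field (such as Cherry Studio) receive a non-empty value.
func newThinkingBlock(thinking string) thinkingBlock {
	return thinkingBlock{
		Type:      "thinking",
		Thinking:  thinking,
		Signature: thinkingSignature(thinking),
	}
}

func main() {
	block := newThinkingBlock("Let me work through the problem step by step.")
	fmt.Println(block.Type, len(block.Signature))
}
```

A digest computed locally only satisfies schema validation on the client side; it is not a signature issued by the upstream model provider.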

Feature Description

Thinking Mode Detection

Two ways to enable thinking mode are supported:

Header Method: Add Anthropic-Beta: interleaved-thinking-2025-05-14 header to the request
Request Body Method: Include thinking-related configuration in the request body

Thinking Instruction Injection

When thinking mode is detected as enabled, the system will inject the following content at the beginning of the system prompt:

<thinking_mode>interleaved</thinking_mode>
<max_thinking_length>200000</max_thinking_length>

reasoning_content Output

For OpenAI format responses, thinking content will be converted to the reasoning_content field:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Actual response content",
        "reasoning_content": "Thinking process content"
      }
    }
  ]
}

Testing Instructions

1. Claude Format Request Test

curl -X POST http://localhost:8080/v1/messages \
  -H "Content-Type: application/json" \
  -H "Anthropic-Beta: interleaved-thinking-2025-05-14" \
  -d '{
    "model": "claude-3-opus",
    "messages": [{"role": "user", "content": "Please explain the basic principles of quantum computing"}]
  }'

Verification points:

Response should contain thinking blocks
System prompt should contain thinking instructions at the beginning

2. OpenAI Format Request Test

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Anthropic-Beta: interleaved-thinking-2025-05-14" \
  -d '{
    "model": "claude-3-opus",
    "messages": [{"role": "user", "content": "Please explain the basic principles of quantum computing"}]
  }'

Verification points:

Response should contain reasoning_content field
content field should contain actual response content

3. Duplicate Injection Prevention Test

Send a request that already contains thinking tags to verify no duplicate injection occurs.

4. Streaming Response Test

Set stream: true in the request and check how thinking content is handled in the streaming response.

Notes

Compatibility: This change is backward compatible and does not affect requests without thinking mode enabled
Performance: max_thinking_length: 200000 is a fixed value and may need adjustment based on actual usage
Header Priority: Neither source takes precedence; thinking mode is enabled if either the header or the request body requests it
Duplicate Prevention Mechanism: The system will detect if <thinking_mode> or <max_thinking_length> tags already exist in the request body to avoid duplicate injection

Related Issues

N/A

Checklist

Code has passed local testing
Code follows project coding standards
Necessary comments have been added
Related documentation has been updated

gemini-code-assist · 2025-12-15T21:10:08Z

Summary of Changes

Hello @Ravens2121, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the Kiro translator's capability to handle Claude's thinking mode. It introduces a more robust and flexible mechanism for enabling this feature, allowing clients to activate it via HTTP headers in addition to existing body-based methods. Furthermore, it improves the output consistency for OpenAI format users by mapping the AI's internal thought processes to a dedicated reasoning_content field, making the reasoning accessible and standardized. The changes also refine how thinking instructions are injected into system prompts and include safeguards against duplicate injections, contributing to a more reliable and user-friendly experience when leveraging advanced AI reasoning features.

Highlights

Header-based Thinking Mode Detection: The Kiro translator now supports enabling Claude's thinking mode by detecting the 'interleaved-thinking-2025-05-14' identifier from the 'Anthropic-Beta' HTTP header, providing an alternative to body-based configuration.
OpenAI reasoning_content Output: Thinking content from Claude responses is now converted into an OpenAI-compatible reasoning_content field instead of being skipped, allowing OpenAI format users to access the AI's internal reasoning process.
System Prompt Optimization: Thinking instructions are now consistently injected at the beginning of the system prompt with a fixed max_thinking_length: 200000, streamlining the configuration for thinking mode.
Duplicate Injection Prevention: The system now detects if thinking tags already exist in the request body, preventing redundant injection of thinking instructions and ensuring clean prompt construction.


¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, and Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.

gemini-code-assist

Code Review

This pull request effectively enhances the Kiro translator to support Claude's thinking mode by detecting it from headers and converting the output to the reasoning_content field for OpenAI compatibility. The changes are well-structured and cover both Claude and OpenAI translation paths.

My review has identified a few areas for improvement:

A critical logic bug where the duplicate injection prevention for thinking tags is implemented but not actually used.
Some code duplication that can be refactored for better maintainability, specifically a redundant function and a helper function defined in two places.

Addressing these points will make the implementation more robust and maintainable. Overall, great work on adding this feature.

gemini-code-assist · 2025-12-15T21:11:23Z

internal/translator/kiro/claude/kiro_claude_request.go

+	// Kiro API doesn't accept max_tokens for thinking. Instead, thinking mode is enabled
+	// by injecting <thinking_mode> and <max_thinking_length> tags into the system prompt.
+	// We use a fixed max_thinking_length value since Kiro handles the actual budget internally.
 	if thinkingEnabled {


The PR description mentions preventing duplicate injection of thinking tags, and the function hasThinkingTagInBody was added for this purpose. However, this check is not being used within BuildKiroPayload. As a result, the thinking hint will be injected even if the request body already contains <thinking_mode> or <max_thinking_length> tags.

Suggested change

-	if thinkingEnabled {
+	if thinkingEnabled && !hasThinkingTagInBody(claudeBody) {

gemini-code-assist · 2025-12-15T21:11:23Z

internal/translator/kiro/openai/kiro_openai_request.go

+	// Kiro API doesn't accept max_tokens for thinking. Instead, thinking mode is enabled
+	// by injecting <thinking_mode> and <max_thinking_length> tags into the system prompt.
+	// We use a fixed max_thinking_length value since Kiro handles the actual budget internally.
 	if thinkingEnabled {


Similar to the Claude translator, the duplicate injection prevention check hasThinkingTagInBody is implemented but not used in BuildKiroPayloadFromOpenAI. This will lead to duplicate thinking tags if the client already provides them.

Suggested change

-	if thinkingEnabled {
+	if thinkingEnabled && !hasThinkingTagInBody(openaiBody) {

gemini-code-assist · 2025-12-15T21:11:23Z

internal/translator/kiro/claude/kiro_claude_stream.go

+// BuildClaudeThinkingBlockStopEvent creates a content_block_stop SSE event for thinking blocks.
+func BuildClaudeThinkingBlockStopEvent(index int) []byte {
+	event := map[string]interface{}{
+		"type":  "content_block_stop",
+		"index": index,
+	}
+	result, _ := json.Marshal(event)
+	return []byte("event: content_block_stop\ndata: " + string(result))
+}


The new function BuildClaudeThinkingBlockStopEvent is identical to the existing function BuildClaudeContentBlockStopEvent. This introduces unnecessary code duplication. Please remove BuildClaudeThinkingBlockStopEvent and use BuildClaudeContentBlockStopEvent in its place in internal/runtime/executor/kiro_executor.go.

gemini-code-assist · 2025-12-15T21:11:23Z

internal/translator/kiro/openai/kiro_openai_request.go

+// hasThinkingTagInBody checks if the request body already contains thinking configuration tags.
+// This is used to prevent duplicate injection when client (e.g., AMP/Cursor) already includes thinking config.
+func hasThinkingTagInBody(body []byte) bool {
+	bodyStr := string(body)
+	return strings.Contains(bodyStr, "<thinking_mode>") || strings.Contains(bodyStr, "<max_thinking_length>")
 }


The function hasThinkingTagInBody is also defined in internal/translator/kiro/claude/kiro_claude_request.go. To improve maintainability and avoid code duplication, this utility function should be moved to a shared package, such as internal/translator/kiro/common, and then called from both kiro_claude_request.go and kiro_openai_request.go.
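
One possible shape for the shared helper suggested above. It is written as package main so that it runs on its own; in the repository it would live in a shared package such as the internal/translator/kiro/common path proposed in this comment. The exported name HasThinkingTag is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// thinkingTags lists the opening tags that indicate the client has already
// supplied a thinking configuration.
var thinkingTags = [][]byte{
	[]byte("<thinking_mode>"),
	[]byte("<max_thinking_length>"),
}

// HasThinkingTag reports whether the raw request body contains either tag.
// It searches the bytes directly, so no string copy of the body is made.
func HasThinkingTag(body []byte) bool {
	for _, tag := range thinkingTags {
		if bytes.Contains(body, tag) {
			return true
		}
	}
	return false
}

func main() {
	withTag := []byte(`{"system":"<thinking_mode>interleaved</thinking_mode>"}`)
	without := []byte(`{"system":"You are a helpful assistant."}`)
	fmt.Println(HasThinkingTag(withTag), HasThinkingTag(without))
}
```

A raw byte search only matches tags that appear literally in the body. Go's encoding/json escapes < and > as \u003c and \u003e by default, so a body produced that way would not match unless it is decoded first.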

feat: enhance thinking mode support for Kiro translator

0a3a955

gemini-code-assist bot reviewed Dec 15, 2025

fix: add signature field to thinking blocks for non-streaming mode

e889efe

- Add generateThinkingSignature() function in kiro_claude_response.go

Ravens2121 mentioned this pull request Dec 15, 2025

lack of thinking signature in kiro's non-stream response cause incompatibility with some ai clients (specifically cherry studio) #27

Closed

Ravens2121 added 2 commits December 16, 2025 05:32

chore: change debug logs from INFO to DEBUG level

f3d1cc8

Merge branch 'router-for-me:main' into master

894703a

luispater approved these changes Dec 16, 2025

luispater merged commit cd0b14d into router-for-me:main Dec 16, 2025
1 check failed
