fix: format tool responses as raw text (no JSON stringify wrapper) #1003
Conversation
Walkthrough

Refactors OpenAI provider tool-response handling to deliver tool outputs as plain multi-line text (status, toolName, error, output) instead of JSON-stringified payloads; adds docs for the format, updates utilities and provider serialization, and adjusts tests to expect the new textual format.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant OpenAIProvider
    participant ToolResponseUtil
    participant Model as OpenAI Model
    rect rgb(245, 245, 255)
        Note over Client,Model: Before — JSON-stringified payloads (over-escaped)
        Client->>OpenAIProvider: returns tool result (string/object)
        OpenAIProvider->>ToolResponseUtil: build payload + JSON.stringify(payload)
        ToolResponseUtil-->>OpenAIProvider: "{\"status\":\"success\",\"output\":\"...\"}"
        OpenAIProvider->>OpenAIProvider: ensureJsonSafe() (wrap/escape)
        OpenAIProvider->>Model: escaped JSON string (double-encoded)
    end
    rect rgb(245, 255, 245)
        Note over Client,Model: After — Plain multi-line text format
        Client->>OpenAIProvider: returns tool result (string/object)
        OpenAIProvider->>ToolResponseUtil: build payload
        ToolResponseUtil-->>OpenAIProvider: "status: success\ntoolName: foo\noutput: ..."
        OpenAIProvider->>OpenAIProvider: ensureJsonSafe() only if needed
        OpenAIProvider->>Model: plain multi-line text (readable)
    end
```
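The "Before" branch of the diagram can be reproduced standalone. The snippet below is an illustration of the double-encoding effect, not the provider's actual code; the payload shape is assumed from the walkthrough.

```typescript
// Illustration: a tool result containing quotes is JSON-stringified when
// the payload is built, then stringified again by a later escaping pass.
const toolResult = 'const greeting = "hi";';

// First pass: payload construction wraps and escapes the quotes.
const once = JSON.stringify({ status: 'success', output: toolResult });

// Second pass: re-stringifying escapes the backslashes themselves.
const twice = JSON.stringify(once);

// once contains \"hi\" ; twice contains \\\"hi\\\" -- the over-escaped
// form the model used to receive.
console.log(once.includes('\\"hi\\"')); // true
console.log(twice.includes('\\\\'));    // true (doubled backslashes)
```

The "After" path skips both stringify passes for the output text, which is why quotes and newlines in code survive intact.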
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks: ✅ Passed checks (4 passed)
Looking at PR #1003, I need to review the implementation of switching the tool response format from JSON to plain text to address the over-escaping issue.

LLxprt PR Review – PR #1003

Issue Alignment
Side Effects
Code Quality
Tests & Coverage
Verdict
Actionable comments posted: 1
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- docs/cli/providers-openai-responses.md
- docs/cli/providers.md
- docs/tool-output-format.md
- packages/core/src/providers/openai/OpenAIProvider.compressToolMessages.test.ts
- packages/core/src/providers/openai/OpenAIProvider.ts
- packages/core/src/providers/openai/buildResponsesRequest.stripToolCalls.test.ts
- packages/core/src/providers/openai/buildResponsesRequest.ts
- packages/core/src/providers/utils/toolResponsePayload.ts
🧰 Additional context used
📓 Path-based instructions (2)
**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Documentation-only changes (*.md files, docs/) do NOT require build/test/lint cycle
Files:
- docs/cli/providers.md
- docs/tool-output-format.md
- docs/cli/providers-openai-responses.md
**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx}:
- Don't use `any` - always specify proper types. Use `unknown` if the type is truly unknown and add proper type guards.
- Do not use `console.log` or `console.debug` - use the sophisticated logging system instead. Log files are written to ~/.llxprt/debug/
- Fix all linting errors, including warnings about `any` types
Files:
- packages/core/src/providers/openai/OpenAIProvider.ts
- packages/core/src/providers/openai/buildResponsesRequest.ts
- packages/core/src/providers/utils/toolResponsePayload.ts
- packages/core/src/providers/openai/OpenAIProvider.compressToolMessages.test.ts
- packages/core/src/providers/openai/buildResponsesRequest.stripToolCalls.test.ts
🧠 Learnings (6)
📓 Common learnings
Learnt from: e2720pjk
Repo: vybestack/llxprt-code PR: 583
File: packages/core/src/providers/openai/OpenAIProvider.ts:935-959
Timestamp: 2025-11-16T22:51:26.374Z
Learning: In the llxprt-code codebase (packages/core/src/providers/openai/OpenAIProvider.ts), tools like `run_shell_command` use internal streaming only for real-time UI updates during execution, but each tool execution produces exactly ONE final `ToolResponseBlock` containing the full result. The streaming chunks are never sent to the LLM and are not converted into multiple tool messages. The OpenAI Chat Completions API requires that each tool call (tool_call_id) corresponds to exactly one message with role 'tool', so duplicate tool response detection that removes subsequent tool messages with the same tool_call_id is correct and necessary for API compliance.
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-18T14:06:22.557Z
Learning: OpenAIResponsesProvider (packages/core/src/providers/openai-responses/OpenAIResponsesProvider.ts) currently bypasses the ephemeral truncation system by using direct `JSON.stringify(toolResponseBlock.result)` and needs to be updated to support ephemeral settings like the other providers.
📚 Learning: 2025-12-18T14:06:22.557Z
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-18T14:06:22.557Z
Learning: OpenAIResponsesProvider (packages/core/src/providers/openai-responses/OpenAIResponsesProvider.ts) currently bypasses the ephemeral truncation system by using direct `JSON.stringify(toolResponseBlock.result)` and needs to be updated to support ephemeral settings like the other providers.
Applied to files:
- docs/cli/providers.md
- packages/core/src/providers/openai/OpenAIProvider.ts
- packages/core/src/providers/openai/buildResponsesRequest.ts
- packages/core/src/providers/utils/toolResponsePayload.ts
- packages/core/src/providers/openai/OpenAIProvider.compressToolMessages.test.ts
- docs/cli/providers-openai-responses.md
- packages/core/src/providers/openai/buildResponsesRequest.stripToolCalls.test.ts
📚 Learning: 2025-11-16T22:51:26.374Z
Learnt from: e2720pjk
Repo: vybestack/llxprt-code PR: 583
File: packages/core/src/providers/openai/OpenAIProvider.ts:935-959
Timestamp: 2025-11-16T22:51:26.374Z
Learning: In the llxprt-code codebase (packages/core/src/providers/openai/OpenAIProvider.ts), tools like `run_shell_command` use internal streaming only for real-time UI updates during execution, but each tool execution produces exactly ONE final `ToolResponseBlock` containing the full result. The streaming chunks are never sent to the LLM and are not converted into multiple tool messages. The OpenAI Chat Completions API requires that each tool call (tool_call_id) corresponds to exactly one message with role 'tool', so duplicate tool response detection that removes subsequent tool messages with the same tool_call_id is correct and necessary for API compliance.
Applied to files:
- docs/cli/providers.md
- docs/tool-output-format.md
- packages/core/src/providers/openai/OpenAIProvider.ts
- packages/core/src/providers/openai/buildResponsesRequest.ts
- packages/core/src/providers/utils/toolResponsePayload.ts
- packages/core/src/providers/openai/OpenAIProvider.compressToolMessages.test.ts
- docs/cli/providers-openai-responses.md
- packages/core/src/providers/openai/buildResponsesRequest.stripToolCalls.test.ts
📚 Learning: 2025-12-14T16:16:39.939Z
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-14T16:16:39.939Z
Learning: OpenAI's current model family (as of December 2025) is GPT-5, including gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-thinking, gpt-5-pro, and GPT-5.2 variants (Instant/Thinking/Pro). GPT-4.1 and GPT-4o are outdated.
Applied to files:
docs/cli/providers.md
📚 Learning: 2025-12-14T03:59:56.073Z
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-14T03:59:56.073Z
Learning: GLM-4.6 can be accessed via Z.ai using Anthropic provider with base-url https://api.z.ai/v1 and model zai-glm-4.6, in addition to Synthetic (hf:zai-org/GLM-4.6) and Cerebras (zai-glm-4.6).
Applied to files:
docs/cli/providers.md
📚 Learning: 2025-12-18T14:06:22.557Z
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-18T14:06:22.557Z
Learning: In the llxprt-code codebase, the `tool-output-max-items` ephemeral setting only applies to tools that return lists/collections (glob, grep, read-many-files), not to single-output tools like shell. Not all ephemerals are relevant to every tool.
Applied to files:
docs/tool-output-format.md
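One recurring constraint in the learnings above is the one-tool-message-per-`tool_call_id` rule. A small filter along these lines illustrates the idea (hypothetical types and function name; the real duplicate detection in OpenAIProvider.ts is more involved):

```typescript
// The Chat Completions API allows exactly one role:'tool' message per
// tool_call_id, so any later message reusing an id is dropped.
interface ChatMessage {
  role: 'user' | 'assistant' | 'tool';
  tool_call_id?: string;
  content: string;
}

function dedupeToolMessages(messages: ChatMessage[]): ChatMessage[] {
  const seen = new Set<string>();
  return messages.filter((m) => {
    if (m.role !== 'tool' || m.tool_call_id === undefined) return true;
    if (seen.has(m.tool_call_id)) return false; // duplicate: drop it
    seen.add(m.tool_call_id);
    return true;
  });
}
```

Keeping the first occurrence matches the learning: each tool execution produces exactly one final `ToolResponseBlock`, so the first tool message per id is the authoritative one.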
🧬 Code graph analysis (2)
packages/core/src/providers/openai/OpenAIProvider.ts (1)
- packages/core/src/providers/utils/toolResponsePayload.ts (1): `formatToolResponseText` (15-39)

packages/core/src/providers/openai/buildResponsesRequest.ts (1)
- packages/core/src/providers/utils/toolResponsePayload.ts (2): `buildToolResponsePayload` (148-178), `formatToolResponseText` (15-39)
🪛 markdownlint-cli2 (0.18.1)
docs/tool-output-format.md
11-11: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 270000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Test (ubuntu-latest, 24.x)
- GitHub Check: Test (windows-latest, 24.x)
- GitHub Check: Test (macos-latest, 24.x)
🔇 Additional comments (12)
docs/cli/providers-openai-responses.md (1)

324-324: LGTM! The addition effectively guides users to the tool output format documentation when troubleshooting tool call issues.
docs/cli/providers.md (1)

95-95: LGTM! The update accurately reflects the new tool output format and provides helpful context with a direct link to the detailed documentation.
packages/core/src/providers/openai/buildResponsesRequest.stripToolCalls.test.ts (1)

71-72: Test expectation correctly updated for the new plain-text format. The expected output structure matches the `formatToolResponseText` implementation with fields separated by newlines: status, toolName, error (empty in the success case), and output containing the raw tool result.

packages/core/src/providers/openai/OpenAIProvider.compressToolMessages.test.ts (2)
25-27: Test content updated to match the new plain-text format. The test now uses the structured multi-line format produced by `formatToolResponseText`, which correctly reflects the production behavior.

56-58: Truncation assertion appropriately simplified. Since the content is now plain text rather than JSON, checking for the `[truncated` substring is the correct approach to verify truncation behavior.

packages/core/src/providers/openai/OpenAIProvider.ts (2)
65-68: Import changes look good. The new imports correctly bring in both `buildToolResponsePayload` for constructing the payload and `formatToolResponseText` for serializing to the plain text format.

1128-1141: Tool response content formatting correctly updated. The implementation properly:
- Builds the payload via `buildToolResponsePayload`
- Uses `formatToolResponseText` with an appropriate fallback for `toolName` (`payload.toolName ?? block.toolName`)
- Applies `ensureJsonSafe` for Unicode sanitization

This aligns with the PR objective of eliminating JSON encoding while preserving raw text structure.
packages/core/src/providers/openai/buildResponsesRequest.ts (3)
5-9: Imports correctly added for new payload utilities. The import statement properly brings in both functions needed for the new tool response formatting approach.
231-234: Good fix to preserve string parameters. Preserving string parameters as-is and only JSON-stringifying non-string parameters prevents double-encoding issues when tool call arguments are already serialized strings.
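The guard this comment describes can be sketched as follows (hypothetical helper name; the actual logic lives inline in buildResponsesRequest.ts):

```typescript
// Tool-call arguments that are already serialized strings pass through
// untouched; anything else is JSON-stringified exactly once.
function serializeToolCallArguments(params: unknown): string {
  if (typeof params === 'string') {
    return params; // already serialized -- stringifying again would double-encode
  }
  return JSON.stringify(params);
}
```

With this guard, a pre-serialized `'{"path":"/tmp/a.txt"}'` and the equivalent object both produce the same single-encoded string, which is the point of the fix.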
245-259: Tool response formatting correctly unified with OpenAIProvider. The implementation follows the same pattern as `OpenAIProvider.buildToolResponseContent`:
- Build the payload via `buildToolResponsePayload`
- Format with `formatToolResponseText` using consistent field mapping
- Apply Unicode sanitization conditionally

This ensures a consistent tool output format across both the Chat Completions and Responses APIs, as stated in the PR objectives.
packages/core/src/providers/utils/toolResponsePayload.ts (2)
15-39: The new `formatToolResponseText` utility is well-structured. The function creates a consistent multi-line format with clear section headers. Each field is separated by blank lines, making the output readable for models. The use of `??` operators with empty strings ensures a consistent output structure even when optional fields are absent.
91-101: Good enhancement to handle wrapped output objects. This change correctly handles tool results that arrive wrapped in an object with an `output` property (e.g., `{ output: "actual content" }`). Extracting the string directly prevents unnecessary JSON serialization of the wrapper object, which aligns with the PR goal of preserving raw text structure. The fallback to `coerceToString` for other object types ensures backward compatibility.
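The unwrapping behavior reviewed here can be sketched as a standalone helper (hypothetical name; `coerceToString` is approximated with `JSON.stringify` for illustration):

```typescript
// Extracts raw text from a tool result. A wrapper object like
// { output: "actual content" } is unwrapped instead of being serialized.
function extractOutputText(result: unknown): string {
  if (typeof result === 'string') return result;
  if (
    typeof result === 'object' &&
    result !== null &&
    'output' in result &&
    typeof (result as { output: unknown }).output === 'string'
  ) {
    return (result as { output: string }).output;
  }
  // Fallback for other shapes (stand-in for coerceToString).
  return JSON.stringify(result);
}
```

The type-guard chain keeps the function `any`-free, in line with the repo's coding guidelines quoted earlier in this review.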
Actionable comments posted: 0
🧹 Nitpick comments (1)
docs/tool-output-format.md (1)
11-23: Consider adding example output and documenting truncation behavior. The format specification is clear, but adding a concrete example with actual values would help readers visualize the output. Additionally, the PR objectives mention that "truncation indicators now appear inline (e.g., `[truncated]`)"; documenting this behavior and when it occurs would make the format more complete for users who encounter truncated outputs.

🔎 Suggested addition (example and truncation note). You could add something like:
### Example

```text
status: success
toolName: run_shell_command
error:
output: file1.txt
file2.txt
```

### Truncation

When tool output exceeds internal size limits, it is truncated with `[truncated]` appended to indicate where the output was cut off.

📜 Review details

Configuration used: Organization UI
Review profile: CHILL
Plan: Pro

📥 Commits: Reviewing files that changed from the base of the PR and between 4d4692018e063cf7fe77272d6438c3823d7b456a and fdbb2b443e69f74ac87deebe42853a78dcb0479e.

📒 Files selected for processing (1)
- docs/tool-output-format.md

🔇 Additional comments (1)

docs/tool-output-format.md (1)

1-29: Clear documentation of the new plain-text format. The documentation effectively explains the rationale (avoiding double/triple JSON encoding) and clearly specifies the four-field format structure. The code block now properly includes the `text` language identifier, resolving the prior linter concern.
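The inline `[truncated]` marker that the review and PR notes mention can be illustrated with a minimal sketch. The helper name and the character-based limit are assumptions; in the real code the limit comes from ephemeral settings.

```typescript
// Appends an inline marker when plain-text tool output exceeds a limit,
// so truncation is visible in the text itself rather than in JSON metadata.
function truncateToolOutput(content: string, maxChars: number): string {
  if (content.length <= maxChars) return content;
  return content.slice(0, maxChars) + '\n[truncated]';
}
```

A test can then verify truncation with a plain substring check, as the updated compressToolMessages tests do.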
Proposed change
Changes the tool output format from JSON-stringified payloads to plain multi-line text for OpenAI models. Tool responses are now formatted as structured text blocks with status, toolName, error, and output fields.
This eliminates double/triple JSON-encoding that produces heavily escaped strings, making tool outputs (especially code and logs) more readable for the model.
Implementation details
- Adds a `formatToolResponseText()` utility that constructs the plain-text format
- Updates `OpenAIProvider.formatToolMessage()` to use the new format instead of `JSON.stringify()`
- Updates the Responses API path (`buildResponsesRequest.ts`) to use the same format for `function_call_output`
- Updates `formatToolResult()` to handle object results with an `output` property
- Adds `docs/tool-output-format.md` explaining the format

Notes

- Truncation indicators now appear inline (e.g., `[truncated]`) rather than as JSON metadata

Related issue
Fixes #1002 — this PR addresses OpenAI tool outputs being double/triple JSON-encoded (over-escaped), which significantly reduces model effectiveness when working with code and complex tool results.
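A minimal sketch of the four-field formatter this PR introduces (field names from the walkthrough; the exact field separator, a single newline here, is an assumption):

```typescript
interface ToolResponseFields {
  status: string;
  toolName?: string;
  error?: string;
  output?: string;
}

// Renders the plain-text block sent to the model. The ?? fallbacks keep
// all four labels present even when optional fields are absent.
function formatToolResponseText(fields: ToolResponseFields): string {
  return [
    `status: ${fields.status}`,
    `toolName: ${fields.toolName ?? ''}`,
    `error: ${fields.error ?? ''}`,
    `output: ${fields.output ?? ''}`,
  ].join('\n');
}
```

Because the output field is appended verbatim, quotes and newlines inside code or logs reach the model unescaped, which is the fix for the over-escaping in #1002.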
Summary by CodeRabbit

- Documentation
- Refactor
- Tests