-
Notifications
You must be signed in to change notification settings - Fork 87
fix: resolve tool scheduling, cancellation, and search issues #961
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- #945: Skip git grep for unsupported brace expansion patterns like *.{ts,tsx} Git grep pathspecs don't support shell-style brace expansion, so we now detect these patterns and fall back to system grep which handles them. - #948: Kill child process on abort in DiscoveredTool.execute The abort signal was being ignored (prefixed with underscore). Now we properly listen for abort and kill the spawned child process. - #951: Use global counter for history message IDs to prevent collisions Changed from per-instance useRef counter to module-level counter so different hook instances using the same timestamp won't collide. - #952/#957: Fix tool confirmation abort signal handling - Store original abort signal per callId in callIdToSignal map - Use stored signal in handleMessageBusResponse instead of creating new one - Add reentrancy guard to publishBufferedResults to prevent race conditions Fixes #945, #948, #951, #952, #957
WalkthroughReplaces per-instance history IDs with a module-level global counter; adds brace-expansion detection and abort handling to grep; propagates and stores per-call AbortSignals in CoreToolScheduler and reuses them on confirmations; adds reentrancy guard for buffered publishing; wires AbortSignal handling into DiscoveredTool.execute. Changes
Sequence Diagram(s)sequenceDiagram
participant UI as UI / Caller
participant Scheduler as CoreToolScheduler
participant SigMap as callIdToSignal (Map)
participant Tool as DiscoveredTool
participant Child as Spawned Child Process
UI->>Scheduler: schedule(callId, params, signal)
Scheduler->>SigMap: store(callId → signal)
Scheduler->>Tool: execute(params, signal)
Tool->>Child: spawn process
Tool->>Tool: attach abort listener -> on abort -> kill Child
alt Normal completion
Child-->>Tool: close (exitSignal)
Tool->>Scheduler: resolve with exitSignal
Scheduler->>SigMap: delete(callId)
Scheduler->>Scheduler: publishBufferedResults (ordered loop)
else Abort triggered
SigMap-->>Scheduler: provide stored signal for callId
Scheduler->>Tool: use original signal in confirmation handling
Tool->>Child: abort listener kills Child
Tool->>Scheduler: reject/resolve as aborted
Scheduler->>SigMap: delete(callId)
Scheduler->>Scheduler: publishBufferedResults (coalesced, guarded)
end
Note right of Scheduler: publishBufferedResults guarded by isPublishingBufferedResults and uses pendingPublishRequest to reschedule if new results arrive
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
Pre-merge checks and finishing touches✅ Passed checks (2 passed)
✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
I'll analyze the PR to review the fixes for tool scheduling, cancellation, and search issues. Let me start by examining the changes and verifying they align with the reported issues. I'll review PR #961 which addresses several critical issues with tool scheduling, cancellation, and search functionality. Let me analyze the changes: LLxprt PR Review – PR #961Issue Alignment
Side Effects
Code Quality
Tests & Coverage
Verdict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
project-plans/20251230fixes/issues/issue-951.json (1)
1-1: Data quality: Issue body contains excessive repetition.The issue body field contains the same paragraph repeated many times (appears to be a copy/paste artifact). While this doesn't affect code functionality, consider cleaning up this data file before merging if it will be used for documentation or reference purposes.
packages/core/src/core/coreToolScheduler.ts (1)
497-504: Add logging when AbortController fallback is used for diagnostics.The stored signal retrieval is correct. The fallback
new AbortController().signal(line 499) handles edge cases where the original signal was deleted (e.g., if the tool call completed while the confirmation response was in transit), but this should be rare. Add a debug log when the fallback is used to help diagnose unexpected cases:const originalSignal = this.callIdToSignal.get(callId); if (!originalSignal && toolSchedulerLogger.enabled) { toolSchedulerLogger.debug( () => `Using fallback AbortSignal for callId=${callId} (original signal not found in map)`, ); } const signal = originalSignal ?? new AbortController().signal;
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
packages/cli/src/ui/hooks/useHistoryManager.tspackages/core/src/core/coreToolScheduler.tspackages/core/src/tools/grep.tspackages/core/src/tools/tool-registry.tsproject-plans/20251230fixes/issues/issue-945.jsonproject-plans/20251230fixes/issues/issue-948.jsonproject-plans/20251230fixes/issues/issue-951.jsonproject-plans/20251230fixes/issues/issue-952.jsonproject-plans/20251230fixes/issues/issue-957.jsonproject-plans/20251230fixes/plan.md
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx}: Don't useany- Always specify proper types. Useunknownif the type is truly unknown and add proper type guards.
Do not useconsole.logorconsole.debug- Use the sophisticated logging system instead. Log files are written to ~/.llxprt/debug/
Fix all linting errors, including warnings aboutanytypes
Files:
packages/core/src/tools/tool-registry.tspackages/cli/src/ui/hooks/useHistoryManager.tspackages/core/src/tools/grep.tspackages/core/src/core/coreToolScheduler.ts
**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Documentation-only changes (*.md files, docs/) do NOT require build/test/lint cycle
Files:
project-plans/20251230fixes/plan.md
🧠 Learnings (5)
📓 Common learnings
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-26T14:17:41.974Z
Learning: In packages/core/src/core/coreToolScheduler.ts, the `publishBufferedResults()` method (line 1355) is called concurrently by multiple tool completion handlers (line 1510 in each tool's `.then()` callback) when parallel tools finish simultaneously. This causes a race condition where `nextPublishIndex` can be corrupted and tools can hang in 'executing' state forever, blocking the scheduler and causing subsequent tool calls to queue indefinitely. The method must be made reentrant-safe using a mutex/lock pattern to serialize concurrent calls.
📚 Learning: 2025-11-16T22:51:26.374Z
Learnt from: e2720pjk
Repo: vybestack/llxprt-code PR: 583
File: packages/core/src/providers/openai/OpenAIProvider.ts:935-959
Timestamp: 2025-11-16T22:51:26.374Z
Learning: In the llxprt-code codebase (packages/core/src/providers/openai/OpenAIProvider.ts), tools like `run_shell_command` use internal streaming only for real-time UI updates during execution, but each tool execution produces exactly ONE final `ToolResponseBlock` containing the full result. The streaming chunks are never sent to the LLM and are not converted into multiple tool messages. The OpenAI Chat Completions API requires that each tool call (tool_call_id) corresponds to exactly one message with role 'tool', so duplicate tool response detection that removes subsequent tool messages with the same tool_call_id is correct and necessary for API compliance.
Applied to files:
project-plans/20251230fixes/issues/issue-948.jsonproject-plans/20251230fixes/issues/issue-957.jsonproject-plans/20251230fixes/issues/issue-952.jsonpackages/core/src/core/coreToolScheduler.ts
📚 Learning: 2025-12-26T14:17:41.974Z
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-26T14:17:41.974Z
Learning: In packages/core/src/core/coreToolScheduler.ts, the `publishBufferedResults()` method (line 1355) is called concurrently by multiple tool completion handlers (line 1510 in each tool's `.then()` callback) when parallel tools finish simultaneously. This causes a race condition where `nextPublishIndex` can be corrupted and tools can hang in 'executing' state forever, blocking the scheduler and causing subsequent tool calls to queue indefinitely. The method must be made reentrant-safe using a mutex/lock pattern to serialize concurrent calls.
Applied to files:
project-plans/20251230fixes/issues/issue-948.jsonproject-plans/20251230fixes/issues/issue-957.jsonproject-plans/20251230fixes/issues/issue-952.jsonproject-plans/20251230fixes/plan.mdpackages/core/src/core/coreToolScheduler.ts
📚 Learning: 2025-12-18T14:06:22.557Z
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-18T14:06:22.557Z
Learning: OpenAIResponsesProvider (packages/core/src/providers/openai-responses/OpenAIResponsesProvider.ts) currently bypasses the ephemeral truncation system by using direct `JSON.stringify(toolResponseBlock.result)` and needs to be updated to support ephemeral settings like the other providers.
Applied to files:
project-plans/20251230fixes/issues/issue-952.json
📚 Learning: 2025-12-12T23:31:19.937Z
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 779
File: project-plans/20251212issue488/PLAN.md:1-7
Timestamp: 2025-12-12T23:31:19.937Z
Learning: Do not validate or review plan documents in the vybestack/llxprt-code repository after implementation. Specifically, skip validation for files under project-plans (e.g., project-plans/**/PLAN.md) once the feature is implemented; these documents are not checked in reviews post-implementation.
Applied to files:
project-plans/20251230fixes/plan.md
🪛 GitHub Check: CodeQL
packages/core/src/tools/grep.ts
[failure] 37-37: Polynomial regular expression used on uncontrolled data
This regular expression that depends on library input may run slow on strings starting with '{{' and with many repetitions of '{{'.
This regular expression that depends on library input may run slow on strings starting with '{{,' and with many repetitions of ','.
🪛 markdownlint-cli2 (0.18.1)
project-plans/20251230fixes/plan.md
190-190: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 270000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Test (macos-latest, 24.x)
- GitHub Check: Test (windows-latest, 24.x)
- GitHub Check: Test (ubuntu-latest, 24.x)
- GitHub Check: E2E Test (Linux) - sandbox:docker
- GitHub Check: E2E Test (Linux) - sandbox:none
- GitHub Check: Slow E2E - Win
- GitHub Check: E2E Test (macOS)
🔇 Additional comments (16)
packages/core/src/tools/grep.ts (1)
511-514: LGTM! Correct strategy selection based on brace expansion.The logic correctly detects brace expansion patterns and skips git grep (which doesn't support shell-style brace expansion like
{ts,tsx,js}), falling back to system grep or the JavaScript implementation which properly handle these patterns.project-plans/20251230fixes/issues/issue-952.json (1)
1-1: Issue documentation artifact - no code review needed.This JSON file documents the parallel tool call hanging issue (#952) and its analysis. The documented race condition in
publishBufferedResults()aligns with the fix implemented incoreToolScheduler.tsin this PR.packages/cli/src/ui/hooks/useHistoryManager.ts (3)
16-18: LGTM! Global counter ensures cross-hook ID uniqueness.Using a module-level counter prevents ID collisions when multiple
useHistoryhook instances generate IDs with the samebaseTimestamp. This correctly addresses issue #951's key collision problem.
69-74: Verify ID uniqueness assumptions for high-frequency operations.The formula
baseTimestamp * 1000 + globalMessageIdCounterworks well for typical usage. However, if more than ~1000 items are added within the same millisecond (samebaseTimestamp), IDs could theoretically collide with the next millisecond's range. This is unlikely in practice but worth documenting.Consider adding a brief comment noting the ~1000 items/ms assumption, or using a larger multiplier (e.g.,
1_000_000) if higher throughput is anticipated.
135-140: Correct: Counter not reset on clear to maintain session-wide uniqueness.The comment correctly explains the rationale. Resetting the counter would risk ID collisions if a user clears history and then adds new items with the same timestamp range.
packages/core/src/tools/tool-registry.ts (3)
87-93: LGTM! Proper abort signal handling for discovered tools.The abort handler correctly:
- Checks
!child.killedbefore attempting to kill (prevents errors on already-terminated processes)- Uses
SIGTERMfor graceful termination- Registers the handler with
addEventListenerThis matches the cancellation pattern used in
ShellExecutionService, fixing issue #948.
134-136: LGTM! Cleanup ensures no listener leaks.The
finallyblock guarantees the abort listener is removed regardless of how the promise resolves, preventing memory leaks and ensuring the handler doesn't fire after execution completes.
139-145: LGTM! Correct usage of exitSignal in error reporting.The
exitSignalvariable now correctly captures the actual signal received by the process (from thecloseevent), providing accurate diagnostics when the tool is terminated by a signal.project-plans/20251230fixes/issues/issue-945.json (1)
1-1: Issue documentation artifact - no code review needed.This JSON file documents issue #945 (search_file_content glob handling). The fix implemented in
grep.tswithhasBraceExpansion()directly addresses the root cause documented here.packages/core/src/core/coreToolScheduler.ts (4)
415-417: LGTM! Per-call signal tracking fixes abort propagation.Storing the abort signal per
callIdensures the original signal flows through to tool execution, even when confirmations are processed via the message bus. This directly fixes issue #948 where shell tools continued running after ESC.
1362-1407: Reentrancy guard correctly prevents race conditions.The
isPublishingBufferedResultsflag serializes concurrent calls topublishBufferedResults(), preventing the race condition wherenextPublishIndexcould be corrupted when multiple tools complete simultaneously. This aligns with the learning about this method needing to be reentrant-safe.The approach of silently returning when already publishing is acceptable because:
- The currently-executing call will process all buffered results up to the first gap
- When new results arrive, subsequent tool completions will trigger another attempt
- The
whileloop ensures all ready results are published in one passBased on learnings, this addresses the race condition in
publishBufferedResults().
948-949: LGTM! Signal stored during scheduling.Storing the signal when the tool call is scheduled ensures it's available later when processing message bus confirmation responses.
1572-1575: LGTM! Proper cleanup of signal mappings.Deleting entries from
callIdToSignalafter tool calls complete prevents memory leaks from accumulating stale signal references.project-plans/20251230fixes/issues/issue-948.json (1)
1-1: Issue documentation artifact - no code review needed.This JSON file documents issue #948 (shell tool cancellation). The detailed analysis correctly identified the root cause at line 494 where a new
AbortControllerwas created, which is now fixed via thecallIdToSignalmap incoreToolScheduler.ts.project-plans/20251230fixes/issues/issue-957.json (1)
1-1: Documentation file looks good.This JSON file documents the investigation and resolution of issue #957, providing valuable context for the fixes implemented in this PR. The detailed CodeRabbit analysis comments trace the debugging process and correctly identify the race condition in
publishBufferedResultsthat matches the implemented fix.project-plans/20251230fixes/plan.md (1)
1-200: Plan document provides clear roadmap for the fixes.This plan document effectively organizes the fixes for five related issues, with clear root cause analysis, proposed solutions, and implementation order. The structure helps track the work and provides good context for reviewers.
Based on learnings, plan documents serve as historical tracking and don't require detailed validation post-implementation.
| ``` | ||
| fix: resolve tool scheduling and cancellation issues | ||
| - #945: Skip git grep for unsupported brace expansion patterns | ||
| - #948: Kill child process on abort in DiscoveredTool.execute | ||
| - #951: Use global counter for history message IDs to prevent collisions | ||
| - #952/#957: Pass original abort signal through confirmation flow and | ||
| add reentrancy guard to publishBufferedResults | ||
| Fixes #945, #948, #951, #952, #957 | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add language identifier to the fenced code block.
The commit message template code block should specify a language for proper markdown rendering.
🔎 Proposed fix
-```
+```text
fix: resolve tool scheduling and cancellation issues
- #945: Skip git grep for unsupported brace expansion patterns
- #948: Kill child process on abort in DiscoveredTool.execute
- #951: Use global counter for history message IDs to prevent collisions
- #952/#957: Pass original abort signal through confirmation flow and
add reentrancy guard to publishBufferedResults
Fixes #945, #948, #951, #952, #957</details>
As per static analysis tools.
<!-- suggestion_start -->
<details>
<summary>📝 Committable suggestion</summary>
> ‼️ **IMPORTANT**
> Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
190-190: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
project-plans/20251230fixes/plan.md lines 190-200: the fenced code block
containing the commit message template lacks a language identifier which can
hinder proper markdown rendering; update the opening fence to include a language
(e.g., ```text or ```md) so the block is rendered correctly and matches the
proposed fix shown in the comment.
Code Coverage Summary
CLI Package - Full Text ReportCore Package - Full Text ReportFor detailed HTML reports, please see the 'coverage-reports-24.x-ubuntu-latest' artifact from the main CI run. |
- Fix ReDoS vulnerability in hasBraceExpansion by using indexOf instead of regex (CodeQL) - Add abort signal handling to git grep and system grep child processes (#962) - Add debug logging when fallback AbortSignal is used (CodeRabbit) - Remove issue JSON files from project-plans (not needed in repo)
|
@coderabbitai I've addressed your feedback:
Additionally, I've added abort signal handling to git grep and system grep child processes (related to #962) so ESC actually cancels running searches now. |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 1211 🏁 Script executed: Length of output: 3508 @acoliver Perfect! I've verified all your changes:
All feedback has been thoroughly addressed, and the additional abort signal handling is a great proactive improvement! 🎉 ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ 🧠 Learnings usedIf you found this review helpful, would you consider giving us a shout-out on X? Thank you for using CodeRabbit! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/core/src/tools/grep.ts (1)
314-314: Replace console logging with DebugLogger.Multiple
console.errorandconsole.debugcalls violate the coding guidelines. These should use the sophisticated logging system with DebugLogger instead, ensuring logs are written to~/.llxprt/debug/and respect logging configuration.🔎 Suggested approach
- Add a DebugLogger instance at the module level:
import { DebugLogger } from '../debug/index.js'; const grepLogger = new DebugLogger('llxprt:tools:grep');
- Replace console calls with logger calls:
// Line 314: Replace console.error(`Error during GrepLogic execution: ${error}`); // With grepLogger.error(() => `Error during GrepLogic execution: ${error}`); // Line 577, 691, 700, 763: Replace console.debug with grepLogger.debug(() => `message here`); // Line 778: Replace console.error with grepLogger.error(() => `message here`);As per coding guidelines, the logging system writes to
~/.llxprt/debug/and provides proper log management.Also applies to: 577-577, 691-691, 700-700, 763-763, 778-778
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
packages/core/src/core/coreToolScheduler.tspackages/core/src/tools/grep.ts
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx}: Don't useany- Always specify proper types. Useunknownif the type is truly unknown and add proper type guards.
Do not useconsole.logorconsole.debug- Use the sophisticated logging system instead. Log files are written to ~/.llxprt/debug/
Fix all linting errors, including warnings aboutanytypes
Files:
packages/core/src/core/coreToolScheduler.tspackages/core/src/tools/grep.ts
🧠 Learnings (3)
📓 Common learnings
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-26T14:17:41.974Z
Learning: In packages/core/src/core/coreToolScheduler.ts, the `publishBufferedResults()` method (line 1355) is called concurrently by multiple tool completion handlers (line 1510 in each tool's `.then()` callback) when parallel tools finish simultaneously. This causes a race condition where `nextPublishIndex` can be corrupted and tools can hang in 'executing' state forever, blocking the scheduler and causing subsequent tool calls to queue indefinitely. The method must be made reentrant-safe using a mutex/lock pattern to serialize concurrent calls.
📚 Learning: 2025-12-26T14:17:41.974Z
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-26T14:17:41.974Z
Learning: In packages/core/src/core/coreToolScheduler.ts, the `publishBufferedResults()` method (line 1355) is called concurrently by multiple tool completion handlers (line 1510 in each tool's `.then()` callback) when parallel tools finish simultaneously. This causes a race condition where `nextPublishIndex` can be corrupted and tools can hang in 'executing' state forever, blocking the scheduler and causing subsequent tool calls to queue indefinitely. The method must be made reentrant-safe using a mutex/lock pattern to serialize concurrent calls.
Applied to files:
packages/core/src/core/coreToolScheduler.ts
📚 Learning: 2025-11-16T22:51:26.374Z
Learnt from: e2720pjk
Repo: vybestack/llxprt-code PR: 583
File: packages/core/src/providers/openai/OpenAIProvider.ts:935-959
Timestamp: 2025-11-16T22:51:26.374Z
Learning: In the llxprt-code codebase (packages/core/src/providers/openai/OpenAIProvider.ts), tools like `run_shell_command` use internal streaming only for real-time UI updates during execution, but each tool execution produces exactly ONE final `ToolResponseBlock` containing the full result. The streaming chunks are never sent to the LLM and are not converted into multiple tool messages. The OpenAI Chat Completions API requires that each tool call (tool_call_id) corresponds to exactly one message with role 'tool', so duplicate tool response detection that removes subsequent tool messages with the same tool_call_id is correct and necessary for API compliance.
Applied to files:
packages/core/src/core/coreToolScheduler.ts
⏰ Context from checks skipped due to timeout of 270000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Test (windows-latest, 24.x)
- GitHub Check: Test (macos-latest, 24.x)
- GitHub Check: Test (ubuntu-latest, 24.x)
- GitHub Check: E2E Test (Linux) - sandbox:docker
- GitHub Check: E2E Test (Linux) - sandbox:none
- GitHub Check: Slow E2E - Win
🔇 Additional comments (9)
packages/core/src/core/coreToolScheduler.ts (5)
415-417: LGTM: Clear signal tracking infrastructure.The
callIdToSignalmap provides the necessary infrastructure to preserve per-call abort signals for later use in confirmation handling, addressing the root cause described in the PR objectives.
954-955: LGTM: Signal stored at the right time.Storing the signal during the validation phase ensures it's available throughout the tool's lifecycle, including when handling message bus confirmation responses.
1368-1370: LGTM: Reentrancy guard addresses known race condition.This guard prevents the concurrent execution issue where multiple tool completion handlers could corrupt
nextPublishIndex, causing tools to hang indefinitely. Based on learnings, this fix directly addresses the race condition when parallel tools finish simultaneously.
1372-1414: LGTM: Robust reentrancy protection with proper cleanup.The try/finally pattern ensures the guard is always cleared, even if an exception occurs during publishing. The early return on line 1375 serializes concurrent calls, preventing the race condition. The in-order publishing logic remains intact.
1578-1582: LGTM: Proper cleanup prevents memory leaks.Cleaning up signal mappings when tool calls complete prevents unbounded growth of the
callIdToSignalmap over time.packages/core/src/tools/grep.ts (4)
37-44: LGTM: ReDoS vulnerability correctly addressed.The indexOf-based approach provides O(n) complexity and avoids the polynomial time behavior of the previous regex pattern. The logic correctly detects brace expansion patterns like
{ts,tsx,js}.
517-520: LGTM: Correctly skips git grep for incompatible patterns.The brace expansion check prevents using git grep with patterns like
*.{ts,tsx}that would fail or produce incorrect results due to git pathspec limitations. This properly addresses issue #945.
546-562: LGTM: Proper abort handling for git grep.The abort handler correctly:
- Kills the child process on cancellation
- Removes the listener in all completion paths (error and close)
- Rejects with a clear error message
This addresses issue #948 by ensuring git grep processes are terminated when aborted.
627-673: LGTM: Comprehensive abort handling for system grep.The implementation properly handles cancellation:
- Abort handler kills the child and invokes cleanup
- The cleanup function consistently removes all event listeners including the abort listener
- Cleanup is called in all termination paths (abort, error, close)
This addresses issue #948 by ensuring system grep processes are properly terminated and resources are cleaned up.
- Handle schedule() promise rejection when cancelling queued tool calls - Use pre-aborted fallback signal when original signal is missing - Fix publishBufferedResults reentrancy to prevent parallel tool hangs Addresses CodeRabbit feedback on PR #961
The publishBufferedResults method was hanging when parallel tool calls
completed out of order. The root cause was filtering toolCalls by status
('scheduled' or 'executing'), but publishResult() changes status to
'success', causing tools to disappear from the filtered list mid-iteration.
Fix:
- Store currentBatchSize when execution starts (immutable during batch)
- Use executionIndex stored in buffered results for ordering instead of
relying on filtered array positions
- Add safety net in finally block to reschedule if pending results remain
This approach is more robust because the batch size and execution indices
are fixed at start and don't change as tools complete.
Added tests for race condition scenarios with out-of-order completions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
packages/core/src/core/coreToolScheduler.test.ts (1)
8-18: DuplicateToolCallimport may cause confusion.
ToolCallis imported both as a type (line 9) and as a value (line 15). While TypeScript allows this when the value export is a class/const with the same name as a type, this redundancy is confusing. IfToolCallis exported as both a type and a value fromcoreToolScheduler.js, consider using only the value import (which TypeScript can also use as a type).🔎 Suggested simplification
-import type { - ToolCall, - WaitingToolCall, - CompletedToolCall, -} from './coreToolScheduler.js'; -import { - CoreToolScheduler, - ToolCall, - WaitingToolCall, - convertToFunctionResponse, -} from './coreToolScheduler.js'; +import { + CoreToolScheduler, + type ToolCall, + type WaitingToolCall, + type CompletedToolCall, + convertToFunctionResponse, +} from './coreToolScheduler.js';
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
packages/core/src/core/coreToolScheduler.test.tspackages/core/src/core/coreToolScheduler.ts
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx}: Don't useany- Always specify proper types. Useunknownif the type is truly unknown and add proper type guards.
Do not useconsole.logorconsole.debug- Use the sophisticated logging system instead. Log files are written to ~/.llxprt/debug/
Fix all linting errors, including warnings aboutanytypes
Files:
packages/core/src/core/coreToolScheduler.tspackages/core/src/core/coreToolScheduler.test.ts
🧠 Learnings (3)
📓 Common learnings
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-26T14:17:41.974Z
Learning: In packages/core/src/core/coreToolScheduler.ts, the `publishBufferedResults()` method (line 1355) is called concurrently by multiple tool completion handlers (line 1510 in each tool's `.then()` callback) when parallel tools finish simultaneously. This causes a race condition where `nextPublishIndex` can be corrupted and tools can hang in 'executing' state forever, blocking the scheduler and causing subsequent tool calls to queue indefinitely. The method must be made reentrant-safe using a mutex/lock pattern to serialize concurrent calls.
📚 Learning: 2025-12-26T14:17:41.974Z
Learnt from: acoliver
Repo: vybestack/llxprt-code PR: 0
File: :0-0
Timestamp: 2025-12-26T14:17:41.974Z
Learning: In packages/core/src/core/coreToolScheduler.ts, the `publishBufferedResults()` method (line 1355) is called concurrently by multiple tool completion handlers (line 1510 in each tool's `.then()` callback) when parallel tools finish simultaneously. This causes a race condition where `nextPublishIndex` can be corrupted and tools can hang in 'executing' state forever, blocking the scheduler and causing subsequent tool calls to queue indefinitely. The method must be made reentrant-safe using a mutex/lock pattern to serialize concurrent calls.
Applied to files:
packages/core/src/core/coreToolScheduler.tspackages/core/src/core/coreToolScheduler.test.ts
📚 Learning: 2025-11-16T22:51:26.374Z
Learnt from: e2720pjk
Repo: vybestack/llxprt-code PR: 583
File: packages/core/src/providers/openai/OpenAIProvider.ts:935-959
Timestamp: 2025-11-16T22:51:26.374Z
Learning: In the llxprt-code codebase (packages/core/src/providers/openai/OpenAIProvider.ts), tools like `run_shell_command` use internal streaming only for real-time UI updates during execution, but each tool execution produces exactly ONE final `ToolResponseBlock` containing the full result. The streaming chunks are never sent to the LLM and are not converted into multiple tool messages. The OpenAI Chat Completions API requires that each tool call (tool_call_id) corresponds to exactly one message with role 'tool', so duplicate tool response detection that removes subsequent tool messages with the same tool_call_id is correct and necessary for API compliance.
Applied to files:
packages/core/src/core/coreToolScheduler.tspackages/core/src/core/coreToolScheduler.test.ts
⏰ Context from checks skipped due to timeout of 270000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Test (windows-latest, 24.x)
- GitHub Check: Test (ubuntu-latest, 24.x)
- GitHub Check: Test (macos-latest, 24.x)
- GitHub Check: E2E Test (Linux) - sandbox:none
- GitHub Check: E2E Test (Linux) - sandbox:docker
- GitHub Check: E2E Test (macOS)
- GitHub Check: Slow E2E - Win
🔇 Additional comments (9)
packages/core/src/core/coreToolScheduler.test.ts (2)
2258-2396: Well-designed race condition test.This test effectively exercises the race condition scenario documented in the learning. The deferred promise pattern with explicit resolver control provides precise timing to trigger the edge case where
publishBufferedResultsmight exit prematurely without processing all buffered results.
2398-2496: Good edge case coverage for blocked publishing.This test validates the scenario where all later tools complete before the first one, ensuring the buffering mechanism correctly holds and publishes results in request order. The timing setup (80ms for tool 1, staggered 5ms increments for others) reliably reproduces the scenario.
packages/core/src/core/coreToolScheduler.ts (7)
415-417: Per-call signal tracking implemented correctly.The
callIdToSignalmap properly stores the original AbortSignal per tool call, enabling correct cancellation semantics when handling message bus confirmation responses asynchronously.
497-520: Pre-aborted fallback signal correctly implemented.When the original signal is missing from
callIdToSignal, the code now correctly creates a pre-aborted signal rather than a never-aborted one. This ensures that confirmation handlers won't erroneously proceed when the tool call has already been cleaned up.
964-965: Signal stored at correct point in scheduling flow.The signal is stored in
callIdToSignalearly in the scheduling process, before any async operations, ensuring it's available for message bus confirmation handling.
1378-1467: Robust reentrancy guard with safety net.The implementation correctly addresses the race condition from the learning:
- The
isPublishingBufferedResultsflag prevents concurrent corruption- The
pendingPublishRequestflag signals when another publish is needed- The do-while loop handles results arriving during publishing
- The finally block safety net (lines 1460-1466) handles the edge case where results arrive between checking
pendingPublishRequestand exitingUsing
setImmediatefor rescheduling is appropriate to prevent stack overflow from deep recursion. Based on learnings, this addresses thepublishBufferedResults()race condition that caused tools to hang in 'executing' state.
1543-1545: Batch size captured at correct point.Setting
currentBatchSizeat execution start (before any tools complete) ensures the publisher knows exactly how many results to expect, regardless of how tool statuses change during execution.
1636-1640: Signal cleanup prevents memory leaks.The cleanup of
callIdToSignalentries happens after all tool calls reach terminal state, ensuring signals aren't deleted prematurely while confirmation responses might still arrive.
1711-1780:cancelAllcorrectly delegates signal cleanup.The
cancelAll()method transitions all tool calls to cancelled status, then callscheckAndNotifyCompletion()which handles thecallIdToSignalcleanup when all calls are terminal.
Summary
This PR fixes five related issues affecting tool execution, cancellation, and search functionality.
Issues Fixed
#945 - search_file_content glob include issues
Problem: Brace expansion patterns like
*.{ts,tsx,js}don't work with git grep, causing silent failures or fallback to slow JS glob scan.Solution: Added
hasBraceExpansion()helper to detect patterns with{...}syntax. When detected, we skip git grep entirely and use system grep (which supports--includeglobs with brace expansion) or the JS fallback.#948 - Shell cancellation doesn't kill the process
Problem: ESC shows "cancelled" but the shell process continues running. The
_signalparameter inDiscoveredTool.execute()was being ignored (underscore prefix indicated unused).Solution: Added proper abort signal handling that kills the spawned child process when abort is triggered, following the same pattern used in
ShellExecutionService.#951 - History ID uniqueness/key collisions
Problem: React keys could collide when multiple
useHistoryhook instances generated IDs using the samebaseTimestampbecause the counter was per-instance (useRef).Solution: Changed to a module-level global counter so IDs are unique across all hook instances. Also multiplied timestamp by 1000 to provide more ID space.
#952 & #957 - Tool read hangs / "Tool call cancelled while in queue" error
Problem: Two related issues stemming from the same root cause:
handleMessageBusResponsecreated a NEWAbortControllerinstead of using the original signal from the scheduled requestpublishBufferedResultscould race when called from multiple async tool completionsSolution:
callIdToSignalmap to store the original signal for each tool callisPublishingBufferedResults) to prevent race conditionsTesting
npm run format- passednpm run lint- passednpm run typecheck- passednpm run test- all tests passnpm run build- passednode scripts/start.js --profile-load synthetic --prompt "write me a haiku"Related
Fixes #945, #948, #951, #952, #957
Summary by CodeRabbit
Bug Fixes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.