
Commit f501137

Add integration tests for streaming with tool calls and enhance UI for tool execution feedback
- Implemented `StreamingWithToolCallsTest` to verify the lifecycle of tool execution during streaming, including event emissions and result handling.
- Enhanced `ChatMessages` component to display tool execution progress with visual indicators for started, completed, and failed states.
- Updated `ChatStreamingLogic` to track tool execution events and reasoning during streaming.
- Added CSS for rotating icons to indicate ongoing tool execution.
- Extended TypeScript types to include tool execution tracking in chat messages.
- Updated SDK to handle new SSE event types for tool execution and reasoning.
- Created documentation for streaming with tool calls, detailing architecture, event types, and usage examples.
1 parent 5d09cbf commit f501137

20 files changed

Lines changed: 2176 additions & 49 deletions

README.md

Lines changed: 57 additions & 4 deletions
````diff
@@ -45,11 +45,15 @@ npm install @knn_labs/conduit-admin-client
 
 ## Key Features
 
-- **OpenAI-Compatible REST API**: Exposes a standard `/v1/chat/completions` endpoint for seamless integration with existing tools and SDKs
+- **OpenAI-Compatible REST API**:
+  - **100% OpenAI compatible** - drop-in replacement for OpenAI API clients
+  - **Extended with Conduit features** - optional enhanced events for reasoning, tool execution, and metrics
+  - **Works with standard clients** - OpenAI SDKs and tools work without any modifications
+  - 📚 For enhanced features, use Conduit SDKs to access real-time tool execution, reasoning events, and performance metrics
 - **Multi-Provider Support**: Interact with various LLM providers through a single interface
 - **Model Routing & Mapping**: Define custom model aliases (e.g., `my-gpt4`) and map them to specific provider models (e.g., `openai/gpt-4`)
 - **Virtual API Key Management**: Create and manage Conduit-specific API keys (`condt_...`) with built-in spend tracking
-- **Streaming Support**: Real-time token streaming for responsive applications
+- **Streaming Support**: Real-time token streaming with optional enhanced events (reasoning, tool execution progress, metrics)
 - **Web-Based User Interface**: Administrative dashboard for configuration and monitoring
 - **Enterprise Security Features**: IP filtering, rate limiting, failed login protection, and security headers
 - **Security Dashboard**: Real-time monitoring of security events and access attempts
@@ -381,15 +385,17 @@ curl http://localhost:5000/v1/chat/completions \
 
 ### Using with OpenAI SDKs
 
+Conduit is **100% compatible with standard OpenAI SDKs** - simply point them to your Conduit instance:
+
 ```python
-# Python example
+# Python example with OpenAI SDK (fully compatible)
 from openai import OpenAI
 
 client = OpenAI(
     api_key="condt_yourvirtualkey",
     # Use http://localhost:5000/v1 for local testing,
     # or your configured CONDUIT_API_BASE_URL for deployed instances
-    base_url="http://localhost:5000/v1"
+    base_url="http://localhost:5000/v1"
 )
 
 response = client.chat.completions.create(
@@ -398,6 +404,53 @@ response = client.chat.completions.create(
 )
 ```
 
+#### Enhanced Features with Conduit SDKs
+
+For access to Conduit-specific features like real-time tool execution progress, reasoning events, and performance metrics, use the official Conduit SDKs:
+
+```typescript
+// Node.js/TypeScript example with Conduit SDK
+import { ConduitCoreClient } from '@knn_labs/conduit-core-client';
+import {
+  isChatCompletionChunk,
+  isToolExecutingEvent,
+  isFinalMetrics
+} from '@knn_labs/conduit-core-client';
+
+const client = new ConduitCoreClient({
+  apiKey: 'condt_yourvirtualkey',
+  baseURL: 'http://localhost:5000'
+});
+
+const stream = await client.chat.create({
+  model: 'gpt-4',
+  messages: [{ role: 'user', content: 'What is the weather?' }],
+  stream: true,
+  function_configuration_ids: ['weather-functions']
+});
+
+for await (const event of stream) {
+  if (isChatCompletionChunk(event)) {
+    // Standard OpenAI content
+    const content = event.choices[0]?.delta?.content;
+  }
+  else if (isToolExecutingEvent(event)) {
+    // Conduit extension: real-time tool execution
+    console.log(`Executing ${event.function_name}...`);
+  }
+  else if (isFinalMetrics(event)) {
+    // Conduit extension: performance metrics
+    console.log(`Tokens: ${event.total_tokens}, Speed: ${event.tokens_per_second}`);
+  }
+}
+```
+
+**Key Differences:**
+- **OpenAI SDKs**: ✅ Full compatibility, ignores Conduit extensions
+- **Conduit SDKs**: ✅ Full compatibility + enhanced events (reasoning, tool execution, metrics)
+
+See [Streaming with Tools Guide](docs/api-guides/streaming-with-tools.md) for complete documentation.
+
 
 ## Documentation
````

SDKs/Node/Core/README.md

Lines changed: 169 additions & 0 deletions
````diff
@@ -93,6 +93,175 @@ for await (const chunk of stream) {
 }
 ```
 
+### Streaming with Function Calling
+
+Conduit extends the OpenAI streaming API with additional event types for richer real-time experiences. While maintaining full OpenAI compatibility, Conduit streams include:
+
+- 🧠 **Reasoning events** - Model thinking/reasoning content
+- 🔧 **Tool execution events** - Real-time function call progress
+- 📊 **Performance metrics** - Live and final metrics
+
+#### Enhanced Event Types
+
+```typescript
+import {
+  isChatCompletionChunk,
+  isFinalMetrics,
+  isReasoningEvent,
+  isToolExecutingEvent,
+  isStreamingMetrics
+} from '@conduit/core';
+
+const stream = await client.chat.completions.create({
+  model: 'gpt-4',
+  messages: [{ role: 'user', content: 'What is the weather in Paris?' }],
+  stream: true,
+  function_configuration_ids: ['weather-functions'], // Conduit managed functions
+});
+
+let totalContent = '';
+let totalReasoning = '';
+const toolCalls = [];
+
+for await (const event of stream) {
+  // Standard OpenAI chat chunks
+  if (isChatCompletionChunk(event)) {
+    const content = event.choices?.[0]?.delta?.content;
+    if (content) {
+      totalContent += content;
+      process.stdout.write(content);
+    }
+
+    // Handle streaming tool calls
+    const deltaToolCalls = event.choices?.[0]?.delta?.tool_calls;
+    if (deltaToolCalls) {
+      for (const toolCall of deltaToolCalls) {
+        const index = toolCall.index ?? 0;
+        if (!toolCalls[index]) {
+          toolCalls[index] = {
+            id: toolCall.id ?? '',
+            type: 'function',
+            function: {
+              name: toolCall.function?.name ?? '',
+              arguments: toolCall.function?.arguments ?? ''
+            }
+          };
+        } else {
+          // Append arguments incrementally
+          if (toolCall.function?.arguments) {
+            toolCalls[index].function.arguments += toolCall.function.arguments;
+          }
+        }
+      }
+    }
+
+    // Check finish_reason
+    const finishReason = event.choices?.[0]?.finish_reason;
+    if (finishReason === 'tool_calls') {
+      console.warn('\nExecuting tools...');
+      // DO NOT end the stream! Backend will execute tools and continue
+    } else if (finishReason === 'stop') {
+      console.warn('\nStream complete');
+    }
+  }
+
+  // Conduit extension: Reasoning events (model thinking)
+  else if (isReasoningEvent(event)) {
+    totalReasoning += event.content;
+    console.warn(`[Reasoning] ${event.content}`);
+  }
+
+  // Conduit extension: Tool execution progress
+  else if (isToolExecutingEvent(event)) {
+    if (event.status === 'started') {
+      console.warn(`\n🔧 Executing ${event.function_name}...`);
+    } else if (event.status === 'completed') {
+      console.warn(`✅ ${event.function_name} completed`);
+      console.warn(`   Result: ${JSON.stringify(event.result)}`);
+      console.warn(`   Cost: $${event.cost}`);
+    } else if (event.status === 'failed') {
+      console.warn(`❌ ${event.function_name} failed: ${event.error_message}`);
+    }
+  }
+
+  // Conduit extension: Live performance metrics
+  else if (isStreamingMetrics(event)) {
+    console.warn(`Speed: ${event.current_tokens_per_second} tokens/sec`);
+  }
+
+  // Conduit extension: Final metrics
+  else if (isFinalMetrics(event)) {
+    console.warn('\nFinal Metrics:');
+    console.warn(`  Total tokens: ${event.total_tokens}`);
+    console.warn(`  Latency: ${event.total_latency_ms}ms`);
+    console.warn(`  Speed: ${event.tokens_per_second} tokens/sec`);
+    console.warn(`  Provider: ${event.provider}`);
+  }
+}
+
+// Use reasoning as fallback if no content (some models output to reasoning)
+const finalContent = totalContent || totalReasoning;
+console.warn('\n\nFinal content:', finalContent);
+console.warn('Tool calls:', toolCalls);
+```
+
+#### Important: finish_reason Semantics
+
+When streaming with function calling, `finish_reason` has special semantics:
+
+- **`finish_reason: "tool_calls"`** - Tool execution in progress, stream **continues**
+- **`finish_reason: "stop"`** - Actual completion, stream ends
+- **`finish_reason: "length"`** - Max tokens reached, stream ends
+
+**Critical**: Do NOT end the stream when `finish_reason === "tool_calls"`. The backend executes tools and continues streaming the model's response with tool results.
+
+```typescript
+// ✅ Correct handling
+if (finishReason === 'tool_calls') {
+  // Tools executing, keep processing stream
+  continue;
+}
+
+if (finishReason === 'stop' || finishReason === 'length') {
+  // Actual completion
+  break;
+}
+
+// ❌ Wrong - ends too early!
+if (finishReason) {
+  break; // This breaks on "tool_calls" prematurely
+}
+```
+
+#### OpenAI Compatibility
+
+Standard OpenAI clients can consume Conduit streams by ignoring Conduit extensions:
+
+```typescript
+import OpenAI from 'openai';
+
+const openai = new OpenAI({
+  baseURL: 'https://your-conduit-instance.com/v1',
+  apiKey: 'your-virtual-key'
+});
+
+const stream = await openai.chat.completions.create({
+  model: 'gpt-4',
+  messages: [{ role: 'user', content: 'Hello!' }],
+  stream: true,
+  tools: [{ type: 'function', function: { name: 'get_weather', ... } }]
+});
+
+for await (const chunk of stream) {
+  // Works exactly like the OpenAI API
+  // Conduit extensions (reasoning, tool-executing, metrics) are ignored
+  const content = chunk.choices[0]?.delta?.content;
+  if (content) process.stdout.write(content);
+}
+```
+
+For more details, see the [Streaming with Tools API Guide](../../docs/api-guides/streaming-with-tools.md).
+
 ### React Query Hooks - Streaming
 
 The React Query integration now supports proper streaming with callbacks:
````

SDKs/Node/Core/src/chat/streaming/chat-streaming-manager.ts

Lines changed: 40 additions & 0 deletions
```diff
@@ -368,6 +368,46 @@ export class ChatStreamingManager
         break;
       }
 
+      case SSEEventType.Reasoning: {
+        const reasoningData = event.data as { content?: string };
+        const reasoning = reasoningData?.content;
+
+        if (reasoning !== undefined && reasoning !== null && reasoning !== '') {
+          this.state.totalReasoning += reasoning;
+          callbacks.onReasoning?.(reasoning, this.state.totalReasoning);
+          this.log('Reasoning content received:', reasoning.slice(0, 100));
+        }
+        break;
+      }
+
+      case SSEEventType.ToolExecuting: {
+        const toolData = event.data as {
+          tool_call_id?: string;
+          function_name?: string;
+          status: string;
+          result?: unknown;
+          cost?: number;
+          error_message?: string;
+          function_execution_id?: string;
+        };
+
+        this.log('Tool execution event:', toolData.function_name, toolData.status);
+        callbacks.onToolExecuting?.(toolData);
+        break;
+      }
+
+      case SSEEventType.ToolResult: {
+        const toolResultData = event.data as {
+          tool_call_id: string;
+          result: unknown;
+          error?: string;
+        };
+
+        this.log('Tool result received for:', toolResultData.tool_call_id);
+        callbacks.onToolResult?.(toolResultData);
+        break;
+      }
+
       case SSEEventType.Error: {
         this.log('Received SSE error event:', event);
         const errorData = event.data as {
```
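The commit message notes that `ChatMessages` renders started, completed, and failed indicators from these events. Below is a minimal sketch of how a consumer of `onToolExecuting` might fold the events into per-tool UI state; the event shape mirrors the `toolData` cast above, but the `ToolUiState` type, the map, and the key choice are illustrative assumptions, not code from this commit:

```typescript
// Illustrative only: event fields mirror the `toolData` cast in the diff above.
type ToolUiState = {
  functionName?: string;
  status: string; // 'started' | 'completed' | 'failed' are the states this commit surfaces
  result?: unknown;
  errorMessage?: string;
};

const toolUi = new Map<string, ToolUiState>();

function onToolExecuting(event: {
  tool_call_id?: string;
  function_name?: string;
  status: string;
  result?: unknown;
  error_message?: string;
}): void {
  // Prefer tool_call_id as the stable key; fall back to the function name.
  const key = event.tool_call_id ?? event.function_name ?? 'unknown';
  const prev = toolUi.get(key);
  toolUi.set(key, {
    functionName: event.function_name ?? prev?.functionName,
    status: event.status,
    result: event.result ?? prev?.result,
    errorMessage: event.error_message ?? prev?.errorMessage,
  });
}
```

Keying on `tool_call_id` lets a later `completed` or `failed` event overwrite the `started` entry for the same call, which is what drives the spinner-to-checkmark transition described in the commit message.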

SDKs/Node/Core/src/chat/streaming/types.ts

Lines changed: 26 additions & 0 deletions
```diff
@@ -185,16 +185,42 @@ export interface StreamMessageOptions extends SendMessageOptions {
  * Callbacks for UI integration
  */
 export interface StreamingCallbacks {
+  /** Called for each chat completion chunk received */
   onChunk?: (chunk: ChatCompletionChunk) => void;
+  /** Called when content delta is received (cumulative content provided) */
   onContent?: (content: string, totalContent: string) => void;
+  /** Called when reasoning/thinking content is received */
+  onReasoning?: (reasoning: string, totalReasoning: string) => void;
+  /** Called when tool execution status updates are received */
+  onToolExecuting?: (event: {
+    tool_call_id?: string;
+    function_name?: string;
+    status: string;
+    result?: unknown;
+    cost?: number;
+    error_message?: string;
+    function_execution_id?: string;
+  }) => void;
+  /** Called when individual tool results are received (optional, for detailed logging) */
+  onToolResult?: (event: {
+    tool_call_id: string;
+    result: unknown;
+    error?: string;
+  }) => void;
+  /** Called when performance metrics are received */
   onMetrics?: (metrics: StreamingPerformanceMetrics | MetricsEventData) => void;
+  /** Called when tokens per second updates are available */
   onTokensPerSecond?: (tokensPerSecond: number) => void;
+  /** Called when an error occurs during streaming */
   onError?: (error: StreamingError) => void;
+  /** Called when streaming completes successfully */
   onComplete?: (response: {
     content: string;
     metadata?: MessageMetadata;
   }) => void;
+  /** Called when streaming starts */
   onStart?: () => void;
+  /** Called when streaming is aborted */
   onAbort?: () => void;
 }
```
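A minimal sketch of an app-side callbacks object exercising the new hooks, assuming `StreamingCallbacks` is re-exported from the package root; the state variables and logging are hypothetical, not part of this commit:

```typescript
// Assumption: StreamingCallbacks is re-exported from the package root.
import type { StreamingCallbacks } from '@knn_labs/conduit-core-client';

// Hypothetical app-side state for the example.
let liveContent = '';
const toolStatus = new Map<string, string>();

const callbacks: StreamingCallbacks = {
  onStart: () => console.log('stream started'),
  // The second argument is cumulative, per the JSDoc above.
  onContent: (_delta, totalContent) => { liveContent = totalContent; },
  onReasoning: (_delta, totalReasoning) => {
    console.log(`[thinking] ${totalReasoning.length} chars so far`);
  },
  onToolExecuting: (event) => {
    // Statuses surfaced by this commit: 'started' | 'completed' | 'failed'.
    if (event.tool_call_id) toolStatus.set(event.tool_call_id, event.status);
  },
  onToolResult: (event) => console.debug('tool result', event.tool_call_id, event.result),
  onError: (error) => console.error('stream error', error),
  onComplete: ({ content }) => console.log('done:', content || liveContent),
};
```

Note that `onContent` and `onReasoning` already deliver cumulative totals, so consumers do not need to accumulate deltas themselves.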

SDKs/Node/Core/src/chat/utils/sse-parser.ts

Lines changed: 11 additions & 0 deletions
```diff
@@ -5,11 +5,22 @@
 
 /**
  * SSE event types from Core API
+ * Combines OpenAI-compatible standard events with Conduit-specific extensions
  */
 export enum SSEEventType {
+  /** Standard OpenAI content chunks containing delta updates */
   Content = 'content',
+  /** Conduit extension: Model reasoning/thinking content separate from main response */
+  Reasoning = 'reasoning',
+  /** Conduit extension: Tool/function execution status and progress updates */
+  ToolExecuting = 'tool-executing',
+  /** Conduit extension: Individual tool execution results (optional, for detailed logging) */
+  ToolResult = 'tool-result',
+  /** Conduit extension: Real-time performance metrics during streaming */
   Metrics = 'metrics',
+  /** Conduit extension: Final performance metrics at stream completion */
   MetricsFinal = 'metrics-final',
+  /** Conduit extension: Error events during streaming */
   Error = 'error'
 }
```
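On the wire, each of these event types arrives as a standard SSE frame pairing an `event:` name with a `data:` JSON payload. Below is a minimal illustrative parser under that assumption; the actual `sse-parser.ts` implementation in this commit is not shown in the diff:

```typescript
// Illustrative SSE framing only; the real parser in this commit is not shown here.
interface ParsedSSEEvent {
  type: string;  // expected to match an SSEEventType value, e.g. 'tool-executing'
  data: unknown; // JSON payload from the data: field(s)
}

function parseSSEFrame(frame: string): ParsedSSEEvent | null {
  let type = 'message'; // SSE default event name when no event: field is sent
  const dataLines: string[] = [];
  for (const line of frame.split('\n')) {
    if (line.startsWith('event:')) {
      type = line.slice('event:'.length).trim();
    } else if (line.startsWith('data:')) {
      dataLines.push(line.slice('data:'.length).trim());
    }
  }
  const data = dataLines.join('\n');
  if (!data || data === '[DONE]') return null; // OpenAI-style terminator, if used
  return { type, data: JSON.parse(data) };
}

// e.g. parseSSEFrame('event: tool-executing\ndata: {"status":"started"}')
```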
