Description
Service
OpenAI
Describe the bug
I was setting up the Realtime API for voice chat and found that after my first question and the model's reply, the WebSocket closed and the conversation ended. (This happened while using CreateServerVoiceActivityTurnDetectionOptions.)
I did not expect calling ReceiveUpdatesAsync to receive server messages to close the _clientWebSocket, but it did.
It took me a while to figure out that AsyncWebsocketMessageResultEnumerator automatically disposes the _clientWebSocket when it finishes iterating, so ReceiveUpdatesAsync ends up terminating the WebSocket. Commenting out the dispose call within AsyncWebsocketMessageResultEnumerator gave me the expected behavior:
public ValueTask DisposeAsync()
{
    //_clientWebSocket?.Dispose();
    return new ValueTask(Task.CompletedTask);
}
This allowed me to have my expected 2-way conversation. If there's another intended method to receive server events without terminating the socket, or to keep the socket alive for more than one request-response, please let me know.
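The mechanism described above can be reproduced outside the library. The following is a minimal, self-contained sketch, not the library's actual code: FakeSocket, MessageEnumerator, and MessageStream are hypothetical stand-ins for the shared ClientWebSocket, the enumerator, and the ReceiveUpdatesAsync result. It only assumes that the enumerator disposes a socket it does not own, as observed above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the session's shared ClientWebSocket.
sealed class FakeSocket : IDisposable
{
    private bool _disposed;

    public string Receive()
    {
        if (_disposed) throw new ObjectDisposedException(nameof(FakeSocket));
        return "message";
    }

    public void Dispose() => _disposed = true;
}

// Hypothetical enumerator that disposes the socket it was handed,
// mirroring the behavior reported for AsyncWebsocketMessageResultEnumerator.
sealed class MessageEnumerator : IAsyncEnumerator<string>
{
    private readonly FakeSocket _socket;
    private int _remaining = 2; // pretend each pass yields two messages

    public MessageEnumerator(FakeSocket socket) => _socket = socket;

    public string Current { get; private set; } = "";

    public ValueTask<bool> MoveNextAsync()
    {
        if (_remaining-- <= 0) return new ValueTask<bool>(false);
        Current = _socket.Receive();
        return new ValueTask<bool>(true);
    }

    public ValueTask DisposeAsync()
    {
        _socket.Dispose(); // the enumerator does not own the socket, yet disposes it
        return default;
    }
}

// Hypothetical stand-in for the sequence returned by ReceiveUpdatesAsync.
sealed class MessageStream : IAsyncEnumerable<string>
{
    private readonly FakeSocket _socket;

    public MessageStream(FakeSocket socket) => _socket = socket;

    public IAsyncEnumerator<string> GetAsyncEnumerator(CancellationToken cancellationToken = default)
        => new MessageEnumerator(_socket);
}

static class Program
{
    static async Task Main()
    {
        var socket = new FakeSocket();
        var stream = new MessageStream(socket);

        // First pass completes; await foreach then calls DisposeAsync,
        // which disposes the shared socket.
        await foreach (var message in stream) Console.WriteLine(message);

        // Second pass creates a new enumerator over the same, now disposed, socket.
        try
        {
            await foreach (var message in stream) Console.WriteLine(message);
        }
        catch (ObjectDisposedException ex)
        {
            Console.WriteLine($"Second pass failed: {ex.ObjectName}");
        }
    }
}
```

In this sketch the second `await foreach` fails for the same reason as in the report: `await foreach` always calls DisposeAsync when the loop ends, so any resource the enumerator disposes there cannot outlive a single pass.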
Steps to reproduce
- Initialize a RealtimeConversationSession with server voice activity turn detection:
var client = new RealtimeConversationClient(model: "gpt-4o-realtime-preview-2024-10-01", new(apiKey));
CancellationTokenSource cts = new();
var session = await client.StartConversationSessionAsync(cts.Token);
var options = new ConversationSessionOptions()
{
    Instructions = "<system prompt>",
    TurnDetectionOptions = ConversationTurnDetectionOptions.CreateServerVoiceActivityTurnDetectionOptions(0.5f, TimeSpan.FromMilliseconds(300), TimeSpan.FromMilliseconds(200)),
    Voice = ConversationVoice.Alloy,
    OutputAudioFormat = ConversationAudioFormat.Pcm16,
    InputTranscriptionOptions = new ConversationInputTranscriptionOptions()
    {
        Model = "whisper-1"
    }
};
await session.ConfigureSessionAsync(options);
- Begin sending audio through SendAudioAsync (in my case with NAudio Wave):
waveIn.DataAvailable += (s, a) =>
{
    using var memoryStream = new MemoryStream();
    memoryStream.Write(a.Buffer, 0, a.BytesRecorded);
    memoryStream.Position = 0;
    session.SendAudioAsync(memoryStream, token).Wait();
};
- Begin handling server responses with ReceiveUpdatesAsync in a loop:
while (true)
{
    await foreach (var update in session.ReceiveUpdatesAsync(token))
    {
        //Handle received updates
    }
}
- Make an audible request to the AI, and wait for its response to complete. On the second loop, ReceiveUpdatesAsync will throw a System.ObjectDisposedException:
Cannot access a disposed object.
Object name: 'System.Net.WebSockets.ClientWebSocket'.
Code snippets
No response
OS
winOS
.NET version
.NET 8 Core
Library version
2.1.0-beta.1