(Beta) Websocket closing unexpectedly on ReceiveUpdatesAsync #244

Open
camelCase12 opened this issue Oct 4, 2024 · 3 comments
Labels
bug Something isn't working

Comments


camelCase12 commented Oct 4, 2024

Service

OpenAI

Describe the bug

I was setting up the Realtime API for voice chat, but found that after my first question and the model's reply, the WebSocket would close and terminate the conversation. (This was while using CreateServerVoiceActivityTurnDetectionOptions.)

When calling ReceiveUpdatesAsync to receive server messages, I didn't expect this to close the _clientWebSocket, but it did.

It took me a while to figure out that AsyncWebsocketMessageResultEnumerator automatically disposes the _clientWebSocket when it finishes iterating, so the end of a ReceiveUpdatesAsync loop terminates the WebSocket. Commenting out the Dispose call within AsyncWebsocketMessageResultEnumerator gave me the expected behavior:

public ValueTask DisposeAsync()
{
    //_clientWebSocket?.Dispose();
    return new ValueTask(Task.CompletedTask);
}

This allowed me to have the two-way conversation I expected. If there's another intended way to receive server events without terminating the socket, or to keep the socket alive for more than one request-response, please let me know.
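For anyone trying to see why a second loop fails, here is a minimal, self-contained sketch of the mechanism as I understand it. It does not use the library at all: FakeSocket, DisposingEnumerator, and Updates are hypothetical stand-ins, and the only real behavior it relies on is that `await foreach` calls DisposeAsync on the enumerator when the loop exits.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the shared ClientWebSocket.
sealed class FakeSocket : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;

    public int Receive()
    {
        if (IsDisposed) throw new ObjectDisposedException(nameof(FakeSocket));
        return 42;
    }
}

// Hypothetical enumerator that mimics the reported behavior:
// it disposes a socket it did not create.
sealed class DisposingEnumerator : IAsyncEnumerator<int>
{
    private readonly FakeSocket _socket;
    private int _remaining;

    public DisposingEnumerator(FakeSocket socket, int count)
    {
        _socket = socket;
        _remaining = count;
    }

    public int Current { get; private set; }

    public ValueTask<bool> MoveNextAsync()
    {
        if (_remaining-- <= 0) return new ValueTask<bool>(false);
        Current = _socket.Receive();
        return new ValueTask<bool>(true);
    }

    public ValueTask DisposeAsync()
    {
        _socket.Dispose(); // the call that was commented out in the workaround
        return default;
    }
}

// Hypothetical enumerable that hands out a fresh enumerator over the same socket.
sealed class Updates : IAsyncEnumerable<int>
{
    private readonly FakeSocket _socket;
    public Updates(FakeSocket socket) => _socket = socket;

    public IAsyncEnumerator<int> GetAsyncEnumerator(CancellationToken cancellationToken = default)
        => new DisposingEnumerator(_socket, count: 1);
}

static class Program
{
    static async Task Main()
    {
        var socket = new FakeSocket();

        // First loop works; leaving it triggers DisposeAsync on the enumerator.
        await foreach (var update in new Updates(socket))
            Console.WriteLine($"first loop received {update}");

        Console.WriteLine($"socket disposed after first loop: {socket.IsDisposed}");

        try
        {
            // Second loop reuses the same socket and fails on the first receive.
            await foreach (var update in new Updates(socket))
                Console.WriteLine($"second loop received {update}");
        }
        catch (ObjectDisposedException ex)
        {
            Console.WriteLine($"second loop failed: {ex.GetType().Name}");
        }
    }
}
```

With the Dispose call removed from DisposeAsync in this sketch, the second loop completes normally, which matches what I saw with the workaround above.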

Steps to reproduce

  1. Initialize a RealtimeConversationSession with server voice activity turn detection:
var client = new RealtimeConversationClient(model: "gpt-4o-realtime-preview-2024-10-01", new(apiKey));

CancellationTokenSource cts = new();

var session = await client.StartConversationSessionAsync(cts.Token);

var options = new ConversationSessionOptions()
{
    Instructions = "<system prompt>",
    TurnDetectionOptions = ConversationTurnDetectionOptions.CreateServerVoiceActivityTurnDetectionOptions(0.5f, TimeSpan.FromMilliseconds(300), TimeSpan.FromMilliseconds(200)),
    Voice = ConversationVoice.Alloy,
    OutputAudioFormat = ConversationAudioFormat.Pcm16,
    InputTranscriptionOptions = new ConversationInputTranscriptionOptions()
    {
        Model = "whisper-1"
    }
};

await session.ConfigureSessionAsync(options);
  2. Begin sending audio through SendAudioAsync (in my case with NAudio Wave):
waveIn.DataAvailable += (s, a) =>
{
    using var memoryStream = new MemoryStream();
    memoryStream.Write(a.Buffer, 0, a.BytesRecorded);
    memoryStream.Position = 0;
    session.SendAudioAsync(memoryStream, token).Wait();
};
  3. Begin handling server responses with ReceiveUpdatesAsync in a loop:
while (true)
{
    await foreach (var update in session.ReceiveUpdatesAsync(token))
    {
        //Handle received updates
    }
}
  4. Make an audible request to the AI, and wait for its response to complete. On the second loop, ReceiveUpdatesAsync will throw a System.ObjectDisposedException:
Cannot access a disposed object.
Object name: 'System.Net.WebSockets.ClientWebSocket'.

Code snippets

No response

OS

Windows

.NET version

.NET 8

Library version

2.1.0-beta.1

camelCase12 added the bug label Oct 4, 2024

edo4444 commented Oct 16, 2024

I don't know if it helps, but any unhandled exception in your code will terminate the WebSocket. Been there.
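A rough sketch of what guarding against that could look like, assuming the cause is an exception escaping the body of the `await foreach` (which exits the loop and disposes the enumerator). FakeUpdates and Handle are hypothetical placeholders, not library APIs; FakeUpdates stands in for session.ReceiveUpdatesAsync(token).

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class Program
{
    // Hypothetical update source standing in for session.ReceiveUpdatesAsync(token).
    static async IAsyncEnumerable<string> FakeUpdates()
    {
        foreach (var update in new[] { "ok-1", "bad", "ok-2" })
        {
            await Task.Yield();
            yield return update;
        }
    }

    // Hypothetical handler that fails on one particular update.
    static void Handle(string update)
    {
        if (update == "bad") throw new InvalidOperationException("handler failed");
        Console.WriteLine($"handled {update}");
    }

    static async Task Main()
    {
        await foreach (var update in FakeUpdates())
        {
            try
            {
                Handle(update);
            }
            catch (Exception ex)
            {
                // Log and keep iterating, so a handler bug does not exit the
                // loop and trigger disposal of the enumerator.
                Console.Error.WriteLine($"skipped {update}: {ex.Message}");
            }
        }
    }
}
```

Catching inside the loop body keeps the iteration going; catching around the whole loop would still end it on the first failure.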

@isaac-j-miller

I am having the same issue. Because AsyncWebsocketMessageResultEnumerator disposes the client WebSocket, I cannot do anything with the WebSocket after iterating through the responses. I don't think AsyncWebsocketMessageResultEnumerator should be responsible for disposing the client WebSocket, because it doesn't create it. The RealtimeConversationSession class should be solely responsible for this, IMO.
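To illustrate the ownership split I have in mind, here is a small sketch with made-up types (FakeConnection, FakeSession, and ReceiveAsync are all hypothetical, not the library's classes or the contents of any PR). The session creates the connection and is the only thing that disposes it; the update stream is an async iterator, so finishing a loop only tears down the iterator's own state.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the connection (for example a ClientWebSocket).
sealed class FakeConnection : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

// Hypothetical session: it creates the connection, so it alone disposes it.
sealed class FakeSession : IAsyncDisposable
{
    private readonly FakeConnection _connection = new();

    public bool IsConnectionOpen => !_connection.IsDisposed;

    // Ending a loop over this stream leaves the connection untouched.
    public async IAsyncEnumerable<string> ReceiveAsync(
        int count,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        for (int i = 0; i < count; i++)
        {
            cancellationToken.ThrowIfCancellationRequested();
            if (_connection.IsDisposed)
                throw new ObjectDisposedException(nameof(FakeConnection));

            await Task.Yield();
            yield return $"update {i}";
        }
    }

    public ValueTask DisposeAsync()
    {
        _connection.Dispose();
        return default;
    }
}

static class Program
{
    static async Task Main()
    {
        await using var session = new FakeSession();

        // Two consecutive loops over the same session, as in the repro steps.
        for (int turn = 0; turn < 2; turn++)
        {
            await foreach (var update in session.ReceiveAsync(count: 2))
                Console.WriteLine($"turn {turn}: {update}");

            Console.WriteLine($"connection open after turn {turn}: {session.IsConnectionOpen}");
        }
    }
}
```

The connection is only closed when the session itself is disposed at the end of Main.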

@isaac-j-miller

I've opened PR #261 to address this.
