Support for audio input in ChatMessageContentPart #292
Comments
Thank you for reaching out, @andhesky! We are working to add audio support in chat completions soon. In the meantime, if you need a quick solution to get unblocked, you could try sending the request as raw JSON through the ChatClient, like this:
using System.ClientModel;
using System.Text.Json;
using OpenAI.Chat;

ChatClient client = new("gpt-4o-audio-preview", Environment.GetEnvironmentVariable("OPENAI_API_KEY"));
// Convert the input audio .wav file to a base64-encoded string.
string inputAudioFilePath = Path.Combine("Assets", "audio_user_message.wav");
using Stream inputAudioStream = File.OpenRead(inputAudioFilePath);
BinaryData inputAudioBytes = BinaryData.FromStream(inputAudioStream);
string base64EncodedInputAudioData = Convert.ToBase64String(inputAudioBytes.ToArray());
// Create and send the chat completion request.
BinaryData input = BinaryData.FromString($$"""
    {
        "model": "gpt-4o-audio-preview",
        "modalities": ["text", "audio"],
        "audio": {
            "voice": "ash",
            "format": "wav"
        },
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "input_audio",
                        "input_audio": {
                            "data": "{{base64EncodedInputAudioData}}",
                            "format": "wav"
                        }
                    }
                ]
            }
        ]
    }
""");
using BinaryContent content = BinaryContent.Create(input);
ClientResult result = await client.CompleteChatAsync(content);
BinaryData output = result.GetRawResponse().Content;
// Parse the JSON response.
using JsonDocument outputAsJson = JsonDocument.Parse(output.ToString());
string data = outputAsJson.RootElement
    .GetProperty("choices"u8)[0]
    .GetProperty("message"u8)
    .GetProperty("audio"u8)
    .GetProperty("data"u8)
    .GetString();
// Save the output audio to a .wav file.
BinaryData outputAudioBytes = BinaryData.FromBytes(Convert.FromBase64String(data));
using FileStream outputAudioStream = File.OpenWrite($"{Guid.NewGuid()}.wav");
outputAudioBytes.ToStream().CopyTo(outputAudioStream);
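The parsing step in the workaround above throws if the response has no audio payload, for example when the request fails or the model replies with text only. A minimal sketch of a more defensive version follows, using only System.Text.Json. The class and method names (AudioResponseParser, TryExtractAudio) are hypothetical and not part of the OpenAI library; the only assumption about the response is the choices[0].message.audio.data path that the snippet above already reads.

```csharp
#nullable enable
using System;
using System.Text.Json;

// Hypothetical helper, not part of the OpenAI .NET library.
public static class AudioResponseParser
{
    // Returns the decoded bytes found at choices[0].message.audio.data,
    // or null when any part of that path is missing or has the wrong type.
    public static byte[]? TryExtractAudio(string responseJson)
    {
        using JsonDocument document = JsonDocument.Parse(responseJson);
        JsonElement root = document.RootElement;

        // TryGetProperty throws on non-object elements, so check ValueKind first.
        if (root.ValueKind != JsonValueKind.Object
            || !root.TryGetProperty("choices", out JsonElement choices)
            || choices.ValueKind != JsonValueKind.Array
            || choices.GetArrayLength() == 0)
        {
            return null;
        }

        JsonElement first = choices[0];
        if (first.ValueKind != JsonValueKind.Object
            || !first.TryGetProperty("message", out JsonElement message)
            || message.ValueKind != JsonValueKind.Object
            || !message.TryGetProperty("audio", out JsonElement audio)
            || audio.ValueKind != JsonValueKind.Object
            || !audio.TryGetProperty("data", out JsonElement data)
            || data.ValueKind != JsonValueKind.String)
        {
            return null;
        }

        // The audio payload is base64-encoded, as in the snippet above.
        return Convert.FromBase64String(data.GetString()!);
    }
}
```

With this in place, the last part of the workaround could call AudioResponseParser.TryExtractAudio(output.ToString()) and write the file only when the result is not null.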
This is now available in version 2.2.0-beta.1: We have an example here: Thank you!
Confirm this is a feature request for the .NET library and not the underlying OpenAI API
Describe the feature or improvement you are requesting
With the announcement of support for input_audio as a content type for the chat completions API, it would be great to add support in C# to include audio in the ChatMessageContentPart. The realtime API is great, but some applications are better suited to the request/response nature of chat completion. However, without support in ChatMessageContentPart, I don't see how I can do so with the C# SDK.
Thanks.
Additional context
No response