[audio] Add pcm audio websocket with dialog support #4032

GiviMAD · 2024-01-10T20:45:06Z

These PR adds a WebSocket Adapter that allows transferring PCM audio to a sink and source whose existence will be tied to the WebSocket connection (register/unregister the components on connection/disconnection) also allows to spawn a dialog processor instance connected to those (will also be tear down on disconnection).

The PR is incomplete and untested, I'll try to add a client to the UI and develop both PR in parallel.

Also IDK if placing the code in the voice bundle is correct, I did it there because it already requires the audio bundle. For me it makes more sense to do it on the audio bundle but I'll have to require both the voice and websocket bundles there.

Best regards!

kaikreuzer · 2024-01-13T09:34:55Z

Also IDK if placing the code in the voice bundle is correct

As the websocket adapter is for audio streams in general, I would say that it should ideally be placed in the audio bundle.
The dependency on the voice manager is very small; maybe you could come up with some small hook, which could be used by the voice bundle to register the dialog processing logic into the websocket adapter?

GiviMAD · 2024-01-13T18:58:10Z

Also IDK if placing the code in the voice bundle is correct

As the websocket adapter is for audio streams in general, I would say that it should ideally be placed in the audio bundle. The dependency on the voice manager is very small; maybe you could come up with some small hook, which could be used by the voice bundle to register the dialog processing logic into the websocket adapter?

Yes, having it on the voice bundle is quite unintuitive, I haven't though on that solution seems like the correct one.

Thank you for the feedback!

Signed-off-by: Miguel Álvarez <[email protected]>

GiviMAD · 2024-01-14T23:19:10Z

I have to refactor the code into the websocket.io bundle for it to build without modifying the BOM features, but I think it also makes sense to have it there with the other WebSocketAdapter implementations.

Signed-off-by: Miguel Álvarez <[email protected]>

GiviMAD · 2024-01-16T18:15:58Z

I think these is ready for the review.

I was able to write a POC in the UI and everything seems to work.

A quick video recorded with my mobile:

video_2024-01-16.mp4

Still need a lot of work on the UI side, I'm going to create a linked issue there asking for some guidance, I wanted to first have a POC to know there are no big issues with the integration.

Best regards!

Signed-off-by: Miguel Álvarez <[email protected]>

florian-h05 · 2024-11-02T09:33:17Z

@openhab/core-maintainers Would be great if this could be reviewed for openHAB 5, as it would allow us to have a voice assistant inside the UI.

florian-h05 · 2024-11-30T09:02:52Z

What would be nice is to have the same WebSocket API format as Home Assistant so we could use ESPHome voice satellites. I don’t know how feasible that is, I think I found HA‘s voice assistant WS endpoint implementation here: https://github.com/home-assistant/core/blob/dev/homeassistant/components/assist_pipeline/websocket_api.py

kaikreuzer · 2024-12-07T12:45:01Z

Would be great if this could be reviewed for openHAB 5, as it would allow us to have a voice assistant inside the UI.

I fully agree. Sorry @GiviMAD for having neglected that topic over the past months - I'll do my very best to drive this forward again for 5.0 (i.e. right after the 4.3 release)!

openhab-bot · 2024-12-17T07:18:00Z

This pull request has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/creation-an-voice-audio-satellite-with-the-help-of-an-esp32/160591/3

Signed-off-by: Miguel Álvarez Díez <[email protected]>

GiviMAD · 2025-01-08T22:19:09Z

Hey @kaikreuzer and @florian-h05 sorry for the late reply. I have rebased the PR and done some changes to simplify its usage. Basically I have moved the audio conversion to the server. If you have any time to review it will be great having it available on the snapshots.

Also @dalgwen, I forgot to mention you here but if you can take a look in case you see something weird, it will be great.

Best regards!

florian-h05 · 2025-01-09T09:13:37Z

Thanks!

Did you have a look at HA‘s voice protocol implementation? API compatibility on the openHAB side would have the benefit that we could make use of HA‘s work in voice satellites, which would make things a lot easier for us as we would have voice satellites already available without the need to take care of developing them.

GiviMAD · 2025-01-09T10:50:00Z

Thanks!

Did you have a look at HA‘s voice protocol implementation? API compatibility on the openHAB side would have the benefit that we could make use of HA‘s work in voice satellites, which would make things a lot easier for us as we would have voice satellites already available without the need to take care of developing them.

Yes, forgot to answer that sorry.

I will prefer to keep to the current implementation, over all because after the latest changes I think its easy to consume. As far as I see they use the ws binary protocol to transmit the source data and they require it to have a specific spec (16000hz, 16bits, 1channel) and the string protocol to send the sink data because they need to transmit identity of the stream and the spec of the format, that requires them to convert the data to base64 so it don't get corrupted. I use 8 bytes as a header for each chunk of audio data and always use the binary format to send the audio data. So the client can choose what spec send to the source it just needs to generate the correct header for the data, and the sink work in a similar way, you remove the first 8 bytes and they contain the format and the stream identifier. I also allow to force the format of the sink so you don't have to make the audio conversion and force a specific spec for the sink data.

Apart from that I think that supporting those devices will be nice, but I think is better to have it in a specific bundle that expose that exact ws protocol and also allow you to discover the connected devices to interact with them through oh instead of as part of the core. I need to look for one of those devices to see if I can do it.

florian-h05 · 2025-01-09T17:56:14Z

Thanks for the explanation - I'm fine with that , just wanted to know what you think about it.

Signed-off-by: Miguel Álvarez Díez <[email protected]>

GiviMAD requested a review from a team as a code owner January 10, 2024 20:45

GiviMAD mentioned this pull request Jan 13, 2024

Upgrade from webpack v4 to v5 openhab/openhab-webui#2267

Merged

[audio] Add pcm audio websocket with dialog support

17970d5

Signed-off-by: Miguel Álvarez <[email protected]>

GiviMAD force-pushed the voice/audio_pcm_websocket branch from 11d3025 to 17970d5 Compare January 14, 2024 11:33

GiviMAD changed the title ~~[WIP] [audio/voice] Add pcm audio websocket with dialog support~~ [WIP] [audio] Add pcm audio websocket with dialog support Jan 14, 2024

GiviMAD added 4 commits January 14, 2024 14:55

improve KSEdgeService support

c5a3ec7

Signed-off-by: Miguel Álvarez <[email protected]>

fix/spotless voice provider

be537e4

Signed-off-by: Miguel Álvarez <[email protected]>

fix dep version

af83a81

Signed-off-by: Miguel Álvarez <[email protected]>

refactor code into websocket io

cdc08a9

Signed-off-by: Miguel Álvarez <[email protected]>

GiviMAD added 2 commits January 16, 2024 11:57

fixes

74d2457

Signed-off-by: Miguel Álvarez <[email protected]>

fixes and improvements

d356b6d

Signed-off-by: Miguel Álvarez <[email protected]>

wborn added the enhancement An enhancement or new feature of the Core label Jan 16, 2024

improvements

ad559e6

Signed-off-by: Miguel Álvarez <[email protected]>

GiviMAD changed the title ~~[WIP] [audio] Add pcm audio websocket with dialog support~~ [audio] Add pcm audio websocket with dialog support Jan 16, 2024

fix comments

2e808a4

Signed-off-by: Miguel Álvarez <[email protected]>

GiviMAD mentioned this pull request Jan 17, 2024

[Main UI] Voice support openhab/openhab-webui#2275

Open

GiviMAD mentioned this pull request Jan 26, 2024

[WIP] [UI] Add voice dialog openhab/openhab-webui#2285

Open

Merge branch 'main' into voice/audio_pcm_websocket

dc9aa43

florian-h05 mentioned this pull request Nov 30, 2024

Feature Planning for openHAB 5 openhab/openhab-distro#1693

Open

GiviMAD added 2 commits January 8, 2025 16:49

Merge branch 'main' into voice/audio_pcm_websocket

0866a5a

[audio] Add pcm audio websocket with dialog support

779f5c8

Signed-off-by: Miguel Álvarez Díez <[email protected]>

GiviMAD force-pushed the voice/audio_pcm_websocket branch from 4c5af80 to 779f5c8 Compare January 8, 2025 21:48

GiviMAD force-pushed the voice/audio_pcm_websocket branch from 43cef7f to f5c22ae Compare January 9, 2025 17:21

sink fix buffer size for i24

631153e

Signed-off-by: Miguel Álvarez Díez <[email protected]>

GiviMAD force-pushed the voice/audio_pcm_websocket branch from f5c22ae to 631153e Compare January 9, 2025 23:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[audio] Add pcm audio websocket with dialog support #4032

[audio] Add pcm audio websocket with dialog support #4032

GiviMAD commented Jan 10, 2024 •

edited

Loading

kaikreuzer commented Jan 13, 2024

GiviMAD commented Jan 13, 2024

GiviMAD commented Jan 14, 2024

GiviMAD commented Jan 16, 2024

florian-h05 commented Nov 2, 2024

florian-h05 commented Nov 30, 2024

kaikreuzer commented Dec 7, 2024

openhab-bot commented Dec 17, 2024

GiviMAD commented Jan 8, 2025

florian-h05 commented Jan 9, 2025

GiviMAD commented Jan 9, 2025 •

edited

Loading

florian-h05 commented Jan 9, 2025

[audio] Add pcm audio websocket with dialog support #4032

Are you sure you want to change the base?

[audio] Add pcm audio websocket with dialog support #4032

Conversation

GiviMAD commented Jan 10, 2024 • edited Loading

kaikreuzer commented Jan 13, 2024

GiviMAD commented Jan 13, 2024

GiviMAD commented Jan 14, 2024

GiviMAD commented Jan 16, 2024

florian-h05 commented Nov 2, 2024

florian-h05 commented Nov 30, 2024

kaikreuzer commented Dec 7, 2024

openhab-bot commented Dec 17, 2024

GiviMAD commented Jan 8, 2025

florian-h05 commented Jan 9, 2025

GiviMAD commented Jan 9, 2025 • edited Loading

florian-h05 commented Jan 9, 2025

GiviMAD commented Jan 10, 2024 •

edited

Loading

GiviMAD commented Jan 9, 2025 •

edited

Loading