Small WebRTC transport for Android #18
Conversation
```kotlin
val connectionUrl = "http://localhost:7860/api/offer"

val client = RTVIClient(SmallWebRTCTransport.Factory(context, connectionUrl), callbacks, options)
```
Maybe in the future we'll need to change this so that we receive the connectionUrl as a result from the authBundle, the same way it currently happens with Daily. But if we get to that point, we'll also need to change the iOS SDK and Web SDK.
For now, this is the way to go.
```kotlin
import kotlinx.serialization.Serializable

@Serializable
@ConsistentCopyVisibility
```
Just double-checking that these changes are intentional.
Yes, this was a fix for a Kotlin compiler warning.
```kotlin
    private const val TAG = "SmallWebRTCTransport"
}

object AudioDevices {
```
Should we also support Bluetooth and wired headsets?
Since Android provides APIs for managing this, I think it's best to avoid abstracting it in the transport and potentially conflicting with what the user is doing. Even the "speakerphone" device just sets a single boolean in the audio service, which it would probably be better for the user to do themselves.
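To illustrate the point above: routing audio to the loudspeaker on Android is essentially one flag on `AudioManager`, so an app can do it directly without any transport support. A minimal sketch (the extension-function shape is illustrative, not part of the SDK):

```kotlin
import android.content.Context
import android.media.AudioManager

// Hypothetical helper: routes call audio to the loudspeaker via the platform
// AudioManager directly, instead of going through the transport.
fun Context.setSpeakerphone(enabled: Boolean) {
    val audioManager = getSystemService(Context.AUDIO_SERVICE) as AudioManager
    // WebRTC call audio normally runs in communication mode
    audioManager.mode = AudioManager.MODE_IN_COMMUNICATION
    @Suppress("DEPRECATION") // setCommunicationDevice() supersedes this on API 31+
    audioManager.isSpeakerphoneOn = enabled
}
```

On API 31+ the `setCommunicationDevice()` API is the preferred replacement, which is another argument for leaving device routing to the app rather than baking one approach into the transport.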
```kotlin
private var client: WebRTCClient? = null
private var selectedCam = CameraMode.Front

override fun initDevices(): Future<Unit, RTVIError> = resolvedPromiseOk(thread, Unit)
```
Are we going to implement this to initialize the device state and report the initial available and selected devices?
This is what I am triggering on iOS.
With the current audio devices implementation, there shouldn't be anything to query.
```kotlin
transportContext.callbacks.onInputsUpdated(
    camera = false,
```
Should we respect the option for whether the camera is enabled or not?
Yes, actually I think this call can be removed entirely.
```kotlin
if (msgWithType.type == "signalling") {
    val msg = JSON_INSTANCE.decodeFromJsonElement<SignallingEvent>(msgJson)

    when (msg.message.type) {
```
We also have the "renegotiate" message, where the server asks the client to renegotiate.
I don't think this message is currently used on iOS (and neither is peerLeft, IIRC).
- The `renegotiate` one, it is.
- The `peerLeft`, not yet, because I have not yet implemented the reconnection logic on iOS, which we currently have on web. This is currently pending on my TODO list. 🙂
Thanks, now implemented!
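For reference, the dispatch discussed above can be sketched as plain Kotlin. The message type names (`renegotiate`, `peerLeft`) come from the discussion; the function shape and return values are hypothetical, not the SDK's actual API:

```kotlin
// Hypothetical dispatcher for the signalling message types discussed above.
// Returns a short description of the action taken, for illustration only.
fun dispatchSignallingMessage(type: String): String = when (type) {
    // Server asks the client to create and send a new SDP offer
    "renegotiate" -> "start renegotiation"
    // Server reports that the remote peer disconnected; the client can use
    // this to trigger its reconnection logic
    "peerLeft" -> "handle peer left"
    else -> "ignore unknown signalling message: $type"
}
```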
```kotlin
    )
}

enableMic(transportContext.options.enableMic)
```
Do we need to do the same for the camera?
Fixed, this line actually needed removing
```kotlin
Log.i(TAG, "Creating PeerConnection")

val iceServers = ArrayList<PeerConnection.IceServer>()
```
It would be useful to allow users to configure the iceServers.
I see that on iOS we let the user configure the URLs, however WebRTC exposes a number of other params:

```java
public Builder setUsername(String username) {
    this.username = username;
    return this;
}

public Builder setPassword(String password) {
    this.password = password;
    return this;
}

public Builder setTlsCertPolicy(TlsCertPolicy tlsCertPolicy) {
    this.tlsCertPolicy = tlsCertPolicy;
    return this;
}

public Builder setHostname(String hostname) {
    this.hostname = hostname;
    return this;
}

public Builder setTlsAlpnProtocols(List<String> tlsAlpnProtocols) {
    this.tlsAlpnProtocols = tlsAlpnProtocols;
    return this;
}

public Builder setTlsEllipticCurves(List<String> tlsEllipticCurves) {
    this.tlsEllipticCurves = tlsEllipticCurves;
    return this;
}
```

It might be best to wait until someone needs this functionality and then do a more comprehensive implementation.
Yes, this is probably something we will need to add soon enough. But I agree, we can add it in a follow-up PR.
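A sketch of what such a configuration surface might look like on Android, mapping a user-facing config onto the `PeerConnection.IceServer` builder from the WebRTC library. The `IceServerConfig` class and its field names are hypothetical, not part of the SDK:

```kotlin
import org.webrtc.PeerConnection

// Hypothetical user-facing config; field names are illustrative only.
data class IceServerConfig(
    val urls: List<String>,
    val username: String? = null,
    val password: String? = null,
)

// Map the hypothetical config onto the WebRTC IceServer builder, covering
// TURN credentials in addition to bare URLs.
fun buildIceServers(configs: List<IceServerConfig>): List<PeerConnection.IceServer> =
    configs.map { config ->
        PeerConnection.IceServer.builder(config.urls).apply {
            config.username?.let { setUsername(it) }
            config.password?.let { setPassword(it) }
        }.createIceServer()
    }
```

The remaining builder params (TLS policy, hostname, ALPN protocols, elliptic curves) could be added to the config class in the same way when someone needs them.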
```kotlin
localAudioTrack = peerConnectionFactory.createAudioTrack("mic", audioSource).apply {
    TrackRegistry.add(this)
    audioSender = peerConnection.addTrack(this)
}

localVideoTrack = peerConnectionFactory.createVideoTrack("cam", videoSource).apply {
    TrackRegistry.add(this)
    videoSender = peerConnection.addTrack(this)
}
```
The SmallWebRTCTransport on the server side always expects that we are going to work with transceivers:
- The first transceiver should send the audio track.
- The second transceiver should send the video track.
Like we are creating it here on iOS:
https://github.com/pipecat-ai/pipecat-client-ios-small-webrtc/blob/a6e4516b1fcbed772ca97a9616dddc9329097958/Sources/PipecatClientIOSSmallWebrtc/SmallWebRTCConnection.swift#L244C18-L251
This allows us to replace the tracks later without needing to renegotiate.
The transceiver order is always respected. So, for example, this is how we try to get the audio and video input tracks on Pipecat:
Currently the client doesn't replace tracks, and as far as I could tell the server doesn't support replacing tracks either. The audio/video tracks are fetched once in `_handle_client_connected()`, and then never again:

```python
async def _handle_client_connected(self):
    # There is nothing to do here yet, the pipeline is still not ready
    if not self._params:
        return
    self._audio_input_track = self._webrtc_connection.audio_input_track()
    self._video_input_track = self._webrtc_connection.video_input_track()
```

So I don't think right now there would be any benefit from creating the transceivers manually. Even so, I'll see if it's a quick change and do it if so.
As a side note, I think it's risky/fragile to rely on the transceiver order being preserved between the server and client. Ideally the server should dynamically detect tracks being started/stopped to avoid being dependent on this.
> Currently the client doesn't replace tracks, and as far as I could tell the server doesn't support replacing tracks either. The audio/video tracks are fetched once in `_handle_client_connected()`, and then never again.
When you're using transceivers, you can replace the track on the client side, and the server will continue receiving it under the same "mid". So, it just works. You don't need a new SDP renegotiation between the peers.
For example, you can start without sending a video track and later enable your camera to begin sending one. Since you're simply replacing the track on the transceiver, this process is transparent from the server's perspective.
One thing I've already implemented on the web, but not yet on iOS, is a signaling message to report the track’s status (whether it's enabled or not):
This shouldn't be necessary. In browsers, for example, if a track isn’t sending data, it automatically changes its state to muted. But this doesn’t happen with aiortc, so I had to implement a signaling message to explicitly inform aiortc when a track is enabled or not.
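A message like the one described could look roughly as follows on the wire. The type and field names (`trackStatus`, `trackKind`, `enabled`) are illustrative, not the actual protocol:

```kotlin
// Hypothetical signalling payload telling the server whether a track is
// currently enabled, so aiortc doesn't have to infer a muted state itself.
data class TrackStatusMessage(val trackKind: String, val enabled: Boolean) {
    // Hand-rolled JSON for illustration; a real client would use kotlinx.serialization
    fun toJson(): String =
        """{"type":"trackStatus","trackKind":"$trackKind","enabled":$enabled}"""
}
```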
Thanks, I've modified the code to create the transceivers explicitly, and only create the tracks as needed. Can confirm that replacing the track on the transceiver seems to work.
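As I understand the change, it looks roughly like this with the Android WebRTC API. A sketch only, assuming a `PeerConnection` and a factory-created `VideoTrack` are in scope; the function names are illustrative, not the PR's actual code:

```kotlin
import org.webrtc.MediaStreamTrack
import org.webrtc.PeerConnection
import org.webrtc.RtpTransceiver
import org.webrtc.VideoTrack

// Create the transceivers up front, in the order the server expects:
// audio first, then video. No tracks are needed at this point.
fun createTransceivers(peerConnection: PeerConnection): Pair<RtpTransceiver, RtpTransceiver> {
    val audio = peerConnection.addTransceiver(MediaStreamTrack.MediaType.MEDIA_TYPE_AUDIO)
    val video = peerConnection.addTransceiver(MediaStreamTrack.MediaType.MEDIA_TYPE_VIDEO)
    return audio to video
}

// Later, when the user enables the camera, attach a track to the existing
// video transceiver. The "mid" stays the same, so no renegotiation is needed.
fun attachVideoTrack(videoTransceiver: RtpTransceiver, videoTrack: VideoTrack) {
    // takeOwnership = false: the caller keeps responsibility for disposing the track
    videoTransceiver.sender.setTrack(videoTrack, false)
}
```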
```kotlin
    audioSender = peerConnection.addTrack(this)
}

localVideoTrack = peerConnectionFactory.createVideoTrack("cam", videoSource).apply {
```
I believe we should only create the video track if enableCam is true.
Same for the audio: create it or not based on enableMic.
I think enableCam and enableMic are just the initial states, and currently on iOS, setting them to false stops the client from ever sending those tracks. On Android we deliberately create the tracks regardless of whether they're hooked up to a source, to work around the server not being able to handle track changes later on.
I.e. if enableCam is initially set to false, but later the user calls showVideo(), this currently won't work on iOS.
> I.e. if enableCam is initially set to false, but later the user calls showVideo(), this currently won't work on iOS.
If this is happening, this is an issue that we need to fix on iOS.
> On Android we deliberately create the tracks regardless of whether they're hooked up to a source, to work around the server not being able to handle track changes later on.
You can change the tracks later on, which is why we're using transceivers on the server. We're using two transceivers. Currently one for audio and one for video. And you can simply replace the track on each transceiver.
You don't need to have a track to create a transceiver. You can just specify the kind ("audio" or "video").
Thanks, I've changed this. We should definitely revisit iOS though since, as far as I can tell, the tracks are only ever created in SmallWebRTCConnection.init -> createMediaSenders(), and only if self.enableCam is set (which means that if it's initially false, there will never be a track).
Yep, this is on my todo list. We need to fix it there for sure.
Thanks for the comments @filipi87! This is ready for re-review.
filipi87 left a comment:
Thanks for applying the changes, @marcus-daily. Looks great! 🚀
Great, thanks @filipi87!