Screen-Sharing with Asterisk's SFU
Whilst building the Dana project I wanted to add the ability to screen-share - it's pretty much the norm in any WebRTC conferencing application nowadays. What isn't so much the norm is sending both your webcam video stream and your screen-share; usually applications swap out your webcam feed for the screen-share, which I dislike entirely. While sharing my screen I'm still explaining myself, still connecting with the other people in the conference call, still using hand gestures and so on. But I hit a snag - it didn't seem to work with Asterisk's SFU. Why?
getDisplayMedia
Let's take a step back here; how do we go about getting plugin- and extension-free access to screen sharing in WebRTC applications? We use the navigator.mediaDevices.getDisplayMedia API in the browser with some audio and video constraints, just as we would with getUserMedia.
Getting the stream should be as simple as:
const stream = await navigator.mediaDevices.getDisplayMedia({
video: {
cursor: 'always'
},
audio: false
});
You can see I've set an extra constraint in the video object so that the cursor is always captured during a screen-share. How many times have you gestured at something on screen while sharing, but the other users can't see what you're doing? Most WebRTC conferencing applications leave this vital addition out for some reason. You can also request audio; in Chrome and Firefox (I'm not sure about other browsers off the top of my head) this allows you to share the audio from a browser tab as an audio stream. The issue with Asterisk's SFU is that it mixes all the audio streams together, so you can't send audio up that you'd then get back again - causing a painful experience for all involved.
I hear you saying "but surely it wouldn't come back - just like your voice doesn't come back to you, because Asterisk is careful about what it mixes to whom"? Ah yes, that would be the case if Asterisk could handle multiple audio and video tracks in the stream uploaded to it - it can't, so to do both webcam and screen-sharing we create an entirely new Peer Connection, stating that we don't want to receive any streams and just want to send one.
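A send-only Peer Connection for the screen-share can be sketched roughly like this - note that sendOfferToAsterisk is a hypothetical signalling helper standing in for whatever your application uses, not a real Asterisk API:

```javascript
// Sketch: a second RTCPeerConnection used purely to send the screen-share up.
async function publishScreenShare(stream) {
  const pc = new RTCPeerConnection();

  // 'sendonly' tells the other side we never want media back on this connection.
  for (const track of stream.getTracks()) {
    pc.addTransceiver(track, { direction: 'sendonly', streams: [stream] });
  }

  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  // Hypothetical signalling round-trip: send our offer, get Asterisk's answer.
  const answer = await sendOfferToAsterisk(pc.localDescription);
  await pc.setRemoteDescription(answer);

  return pc;
}
```

The webcam keeps flowing on the original Peer Connection; this one exists only so the screen-share can be negotiated as its own separate stream.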
So we create a new Peer Connection purely for streaming up the screen-share, alongside the existing Peer Connection carrying the webcam - woohoo, that's it, and it's super simple? Unfortunately not. Asterisk has this need for audio - the very core of Asterisk wants audio - and so even though Asterisk accepts this screen-share video stream, it isn't able to forward it on because it has no audio. How do we get around this? We generate silence using Web Audio.
_createSilence() {
  const ctx = new AudioContext();
  const oscillator = ctx.createOscillator();
  const dst = oscillator.connect(ctx.createMediaStreamDestination());
  oscillator.start();
  // A disabled audio track transmits silence, so we never hear the oscillator's tone
  return Object.assign(dst.stream.getAudioTracks()[0], { enabled: false });
}
const stream = await navigator.mediaDevices.getDisplayMedia({
video: {
cursor: 'always'
},
audio: false
});
const silenceTrack = _createSilence();
stream.addTrack(silenceTrack);
First we have this _createSilence function, which creates a new Audio Context and an oscillator from it; we connect the oscillator to a new media stream destination and start it. We then return the audio track with enabled set to false - a disabled audio track transmits silence, so the oscillator's tone never actually goes anywhere. Pretty nice, eh. We take that track and add it to the existing screen-share media stream. Magically, Asterisk now takes our video stream of the screen-share and forwards it on correctly.
We now have to deal with the fact that we're receiving the screen-share we're sending on the other Peer Connection, but that's a trivial job of not showing it, because we know it's our screen-share.
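One way to do the "not showing it" part might look like this - assuming your signalling layer tells you which remote stream corresponds to the screen-share you published (ownScreenShareId here is that assumed identifier, and attachToVideoElement is a hypothetical render helper):

```javascript
// Sketch: skip rendering our own screen-share when it arrives back from the SFU.
pc.addEventListener('track', (event) => {
  const [remoteStream] = event.streams;

  if (remoteStream && remoteStream.id === ownScreenShareId) {
    return; // it's our own screen-share coming back; don't display it
  }

  attachToVideoElement(remoteStream); // hypothetical helper that renders the stream
});
```

How you correlate the remote stream with your own publication depends entirely on your signalling; the comparison above is just one plausible shape for it.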
If you're interested in following along, there's an active issue in Asterisk's issue tracker covering having to send audio even though we don't really have any.
A big thanks to Lorenzo Miniero from the Meetecho team for helping me figure this one out!