Brussels / 2 & 3 February 2013


Hangout-like video conferences with Jitsi and XMPP

This talk presents how the Jitsi project's handling of multi-party video conferencing has evolved. There are various approaches to the problem, and the Jitsi project tackles it in a fashion similar to Skype and Google. This has culminated in Jitsi Videobridge: a new XMPP server component that focus agents can control via dedicated XMPP IQs.

About a year ago, the Jitsi project developers started work on support for video conference calls. We had had audio conferencing for a while at that point and we were using it regularly in our dev meetings. Video seemed like a logical next step, so we rolled up our sleeves and got to work.

The first choice that we needed to make was how to handle video distribution. The approach that we had been using for audio was for one of the participating Jitsi instances to mix all flows. That's easy to do for audio streams and any recent machine can easily mix a call with six or more participants.
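To make the mixing idea concrete, here is a minimal sketch of how a single node might mix audio for a conference. It assumes equal-length frames of 16-bit PCM samples and uses plain sum-and-clip mixing; the function names are hypothetical and this is not taken from Jitsi's actual code.

```python
from typing import Dict, List, Sequence

# Assumption: samples are signed 16-bit PCM values.
INT16_MIN, INT16_MAX = -32768, 32767


def mix_frames(frames: Sequence[Sequence[int]]) -> List[int]:
    """Mix equal-length PCM frames by summing samples and clipping."""
    if not frames:
        return []
    length = len(frames[0])
    if any(len(frame) != length for frame in frames):
        raise ValueError("all frames must have the same length")
    mixed = []
    for samples in zip(*frames):
        total = sum(samples)
        # Clip so the result still fits in a signed 16-bit sample.
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


def mix_for_participant(frames_by_participant: Dict[str, Sequence[int]],
                        listener: str) -> List[int]:
    """Mix every flow except the listener's own, so nobody hears themselves."""
    others = [frame for name, frame in frames_by_participant.items()
              if name != listener]
    return mix_frames(others)
```

The per-sample work is a handful of additions per participant, which is consistent with the observation that an ordinary machine can mix a call with six or more participants.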

Video, however, was a different story. Mixing video into composite images is an extremely expensive affair and one could not achieve this in real time with today's desktop or laptop computers. We had to choose between an approach where the conference organizer would simply switch to the active speaker and a solution where a central node would relay all streams to all participants, while every participant keeps sending a single stream.

We finally went for the latter, which also seems to be the approach taken by Skype and Google for their respective conferencing services. We started by implementing all this in Jitsi, but along the way we also decided to make the RTP relaying part a separate server-side component. This is how Jitsi Videobridge was born: an XMPP server component that focus agents can control via dedicated XMPP IQs.
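The relaying approach can be illustrated with a small sketch. It models a central node that forwards each incoming packet, unmodified, to every other participant, and counts the streams involved. The class and function names are hypothetical, packets are plain byte strings, and nothing here reflects the actual Jitsi Videobridge implementation or its XMPP control protocol.

```python
from typing import Dict, List, Tuple


class RelaySketch:
    """Hypothetical central node that forwards packets without mixing them."""

    def __init__(self) -> None:
        self.participants: List[str] = []

    def join(self, name: str) -> None:
        if name not in self.participants:
            self.participants.append(name)

    def relay(self, sender: str, packet: bytes) -> List[Tuple[str, bytes]]:
        """Return (recipient, packet) pairs for everyone except the sender."""
        if sender not in self.participants:
            raise KeyError(f"unknown participant: {sender}")
        return [(name, packet) for name in self.participants if name != sender]


def stream_counts(n: int) -> Dict[str, int]:
    """Count streams in an n-party relayed conference."""
    if n < 1:
        raise ValueError("a conference needs at least one participant")
    return {
        "sent_per_participant": 1,          # each participant uploads once
        "received_per_participant": n - 1,  # and receives everyone else
        "forwarded_by_relay": n * (n - 1),  # total copies the relay sends out
    }
```

The counts show the trade-off: each participant's upload stays at a single stream regardless of conference size, while the relay's outgoing traffic grows quadratically, which is one reason to run that part on a server rather than on a participant's machine.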


Emil Ivov