By: Marine Sorin, Senior Product Manager and Julien Le Tanou, Video Research Manager
Imagine watching a live sports event where, with a swipe of your finger, you can switch between camera angles, follow your favorite player, or even jump to a completely different game happening simultaneously. Sounds like a dream, right? Welcome to the exciting world of multiview streaming!
At MediaKind, creating unbeatable viewing experiences is at the core of what we do. That’s why we’ve been diving deep into the technology behind multiview streaming, studying how to bring viewers the most immersive and interactive moments. Let’s explore what multiview streaming is, why it’s thrilling for viewers, and the innovative tech that makes it all possible.
Why is multiview streaming appealing?
Multiview streaming is more than just another cool feature—it’s a game-changer. It puts the viewer in the driver’s seat, giving them control over their experience. Let’s break down why this is a big deal.
Choice and Control
It empowers you to decide how and what you want to watch, whether it’s following your favorite athlete, switching between different camera angles, or even catching two live games side by side. Imagine the freedom of choosing exactly which player or game to focus on, never missing the moments that matter most. For sports enthusiasts, especially on action-packed days, this is a dream come true. Whether it’s multiple games happening at once or tracking key moments, multiview lets you create your own tailored viewing experience.
Enhanced Engagement
Multiview also amps up viewer engagement. Whether it’s watching from different angles during a soccer match or toggling between the lead and the peloton in a cycling race, you stay locked into the action that excites you. You can dive deep into the game, follow your favorite player, and experience those critical moments from every possible viewpoint. This level of immersion makes you feel like you’re part of the action, experiencing every play, cheer, and thrill from a front-row seat.
Beyond Sports
And it goes beyond sports. Multiview streaming opens up endless possibilities for live events—whether it’s following live news, monitoring weather updates, or watching a concert while keeping an eye on a sports game. The potential is limitless, offering a level of interactivity and personalization that redefines how we engage with content.
The tech behind multiview streaming
Now, you might think this all just happens magically, but pulling off multiview streaming is no easy feat. It’s a delicate balance between processing power, bandwidth, and interoperability. Let’s take a closer look at the tech making multiview possible and the trade-offs involved.
Server-side function: composition and one single stream encode
Popular with platforms like YouTube [1], this method involves pre-encoding all possible view combinations upfront. It’s a straightforward and effective approach, especially for platforms that need to deliver multiple perspectives seamlessly. By delivering only one stream, it optimizes bandwidth usage. As a codec-agnostic solution that doesn’t disrupt existing DRM or delivery systems, this approach is simple to build and deploy. On the client side, the process is completely transparent: it uses just one decoder, making it compatible with all devices, including low-powered smart TVs and set-top boxes. It’s also fast to deploy and works within the existing ecosystem.
This method works well; sometimes simplicity is the best solution! It’s efficient for delivering multiple perspectives. It is worth noting that as the number of views N grows, the number of possible multiview variants grows to (2^N − 1), one for every non-empty combination of views, which can become processing-intensive if many views are produced. In practice, most viewers find that watching 2 or 3 perspectives simultaneously is sufficient to appreciate the action from different angles, meaning not all possible combinations need to be created. This approach remains manageable and, when demand for multiview increases, it signals strong public interest!
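To make the combinatorics concrete, here is a minimal Python sketch that counts the view combinations a server would have to pre-encode, with and without a cap on how many views are shown at once. The feed names and the `max_simultaneous` parameter are illustrative assumptions, and the count ignores how the views are arranged on screen.

```python
from itertools import combinations

def layout_variants(views, max_simultaneous=None):
    """Return every non-empty combination of views, optionally capped in size."""
    limit = len(views) if max_simultaneous is None else min(max_simultaneous, len(views))
    variants = []
    for size in range(1, limit + 1):
        variants.extend(combinations(views, size))
    return variants

# Hypothetical feed names, for illustration only.
views = ["main", "player_cam", "goal_cam", "aerial"]

all_variants = layout_variants(views)                 # 2**4 - 1 combinations
capped = layout_variants(views, max_simultaneous=2)   # singles and pairs only

print(len(all_variants), len(capped))
```

With four feeds, the uncapped count is 15, while limiting layouts to at most two simultaneous views brings it down to 10. The gap widens quickly as feeds are added, which is why capping the layout size keeps the server-side approach manageable.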
If you want to dive deeper into this approach, visit our site for more details.
Edge Cloud Processing
In the Edge Cloud Processing approach, the Cloud manages the heavy lifting by encoding or transcoding personalized layouts on-demand for each individual viewer. This is like Cloud Gaming: the personalized experience is rendered and transcoded in the cloud, then delivered through a single stream per user. It remains codec-agnostic and maintains full DRM compatibility.
As with Cloud Gaming, scaling this solution can be costly since each stream must be individually processed for each user. Additionally, keeping video processing delays minimal to maintain interactivity raises network bandwidth requirements, increasing overall distribution costs.
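The scaling difference can be illustrated with a rough back-of-the-envelope sketch in Python. All figures below (number of layouts, ABR rungs, audience size) are hypothetical, and the model deliberately ignores real-world factors such as encoder sharing or caching.

```python
def server_side_encodes(num_layouts, abr_rungs):
    """Pre-composed layouts: one encode per layout and ABR rung, independent of audience size."""
    return num_layouts * abr_rungs

def edge_cloud_encodes(concurrent_viewers):
    """Per-viewer composition: one personalized encode per concurrent viewer."""
    return concurrent_viewers

# Hypothetical event: 10 pre-defined layouts, 5 ABR rungs, 100,000 concurrent viewers.
print(server_side_encodes(num_layouts=10, abr_rungs=5))
print(edge_cloud_encodes(concurrent_viewers=100_000))
```

Under these assumptions, the server-side workload stays fixed as the audience grows, whereas the edge cloud workload grows linearly with the number of concurrent viewers.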
Player side function: multi-decoders
This approach, used by Apple TV [2] and others, delivers one stream per view, with multiple decoders on the client side managing the rest. This method scales effectively for many users, providing excellent user interaction and a high-quality experience without adding any extra processing complexity on the server side.
However, there are a few challenges. The performance is limited by the availability of decoders on the device, and syncing audio and video streams can sometimes be difficult. Additionally, since multiple decoders are competing for bandwidth, it can impact overall performance, especially in bandwidth-constrained environments.
Tiledmedia approach
Our friends at Tiledmedia have introduced an interesting twist to the player-side approach [3], leveraging specific properties of recent video codecs. Their player creates an interactive multiview composition from various video streams encoded in HEVC, with VVC or AV1 to follow in the future. Unlike the multi-decoder approach, their approach does not require multiple decoder instances and allows for seamless interaction while the system works in the background to optimize bandwidth in real time.
The key to Tiledmedia’s technology lies in “tiling,” a feature [4] available with HEVC, VVC, and AV1 codecs. In simple terms, a tile in HEVC is a rectangular region of the picture that is coded independently, much like a smaller, independent sub-stream. By using this codec property, the Tiledmedia player is able to retrieve multiple streams and provide an interactive multiview experience with a single decoder instance. It allows viewers to choose and switch between various camera angles or video feeds without disrupting the overall video quality.
Here’s how the system works:
- On the encoding/origin side, all video streams are encoded using HEVC tile syntax for every adaptive bitrate (ABR) representation. The streams must follow a few light requirements to enable smooth switching between different views. The packaging and delivery format remains unchanged, such as using HLS (HTTP Live Streaming) with optional low-latency CMAF.
- On the client side, the Tiledmedia player SDK handles the multiview streams with its own adaptive bitrate logic, feeds them to a single decoder, and then interactively displays the tiles on the screen, letting the user move, resize and customize the layout as they wish. For completeness’ sake, we note that Tiledmedia also offers a single-player multi-decoder option for web players.
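As an illustration of what per-tile adaptive bitrate logic could look like on the client, here is a minimal Python sketch. It is not Tiledmedia’s algorithm: the `Tile` type, the ladder values, and the area-weighted allocation rule are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Tile:
    view: str            # hypothetical feed name
    screen_share: float  # fraction of the screen area the tile occupies

# Hypothetical ABR ladder in kbit/s, assumed identical for every view.
LADDER_KBPS = [400, 1200, 3000, 6000]

def pick_bitrates(tiles, budget_kbps, ladder=LADDER_KBPS):
    """Give each tile the highest rung that fits its area-weighted share of the budget.

    If no rung fits, fall back to the lowest rung, which may exceed the budget.
    """
    total_share = sum(t.screen_share for t in tiles)
    choice = {}
    for t in tiles:
        share_kbps = budget_kbps * t.screen_share / total_share
        fitting = [rung for rung in ladder if rung <= share_kbps]
        choice[t.view] = max(fitting) if fitting else min(ladder)
    return choice

layout = [Tile("main", 0.6), Tile("player_cam", 0.2), Tile("aerial", 0.2)]
print(pick_bitrates(layout, budget_kbps=8000))
```

In this sketch, enlarging a tile raises its share of the bandwidth budget, so the view the user focuses on gets the higher-quality representation while the smaller tiles stay on lower rungs.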
This tiling approach offers a highly interactive solution that provides viewers with a smooth, personalized experience. It comes with a few trade-offs though, as it requires the use of specific codecs like HEVC or AV1 and may present challenges for compatibility with certain digital rights management (DRM) systems and requirements. However, Tiledmedia has successfully overcome these DRM-related issues, ensuring a seamless and secure experience.
Takeaway
With a fixed number of layouts and a large diversity of devices – including legacy set-top-boxes – to support, the server-side approach is extremely efficient and easy to deploy.
For an even more interactive experience on personal devices, the player-side approach, and notably the tiled streaming approach introduced by Tiledmedia, gives the utmost user control and flexibility.
These various technologies are not mutually exclusive: as multiview becomes more and more popular, it is quite likely that both approaches will be combined based on device, content and interactivity requirements.
The future of multiview streaming
Multiview streaming is setting a new standard for how we interact with content, giving viewers the control they crave and delivering experiences that go beyond passive watching. Whether you’re tracking every angle of a live sports match or juggling multiple events, this technology delivers more personalized and engaging experiences.
As it continues to evolve, multiview streaming is only going to get better. Imagine the day when you can personalize every event you watch, from sporting events and concerts to breaking news, all without sacrificing quality or speed. The future of viewing is in your hands, and it’s never been more exciting!
References
- [1] “Watch multiple events on one screen with multiview on YouTube”, Online at: https://support.google.com/youtube/answer/13780547?hl=en
- [2] “Watch multiple live sports streams in the Apple TV app on iPad”, Online at: https://support.apple.com/guide/ipad/watch-multiple-live-sports-streams-ipadeb07ab65/ipados
- [3] “Why Single-Player Multiview is Superior”, Online at: https://www.tiledmedia.com/everything-about-multiview
- [4] Kiran Misra et al. “An Overview of Tiles in HEVC”, IEEE Journal of Selected Topics in Signal Processing (Volume: 7, Issue: 6, December 2013)