Posted by Emily Pali on April 17, 2020

How to Successfully Deliver A Large-Scale Livestream Event

With larger online audiences than ever before, the livestreaming landscape is fraught with challenges from network congestion to technology failures. Running large-scale critical livestreaming events requires content owners to be nimble, remove single points of failure, and ensure redundancy and failover within their workflows to flawlessly deliver streams to end users over a myriad of devices and platforms.

There are many components involved in delivering a successful livestream. When livestreaming an event, the ultimate goal is to provide the best viewer experience, meaning the process flow needs to be optimized to reduce glass-to-glass latency or delays. To provide the most exceptional user experience, each step of the workflow -- capturing the event on camera, encoding, processing, and delivering -- must be optimized. Optimizing each step involves many decisions, such as choosing the most effective streaming protocol and the best encoder, as well as deciding where that infrastructure should reside.

This piece shares insights from streaming a large-scale event and describes the optimizations that can be made to the delivery step of the livestream process so that content providers can ensure exceptional user experiences.

Implement a Multi-CDN Strategy

It’s not feasible to attempt to deliver a large, high-profile event with a single Content Delivery Network (CDN). All CDNs, regardless of how well configured, can experience delivery errors under stress. A poor streaming experience during a highly anticipated premium event can draw negative attention and social media backlash. This backlash could damage brand equity and customer retention, and perhaps even cost you the rights to future live events.

Using multiple CDNs for a large-scale livestream reduces the risk of poor user experience. Teams using a multi-CDN strategy can reserve capacity to ensure the CDNs can handle peak viewership as well as any traffic bursts expected during the livestream. These teams can also ensure that enough bandwidth has been put aside to take on a larger share of overall event traffic if one of the CDNs experiences an outage mid-event.

Choose the right infrastructure for your use case

How video teams decide to implement traffic steering depends on how much control they want over real-time decision making.

DNS-based decisions are a good option for teams using short time-to-lives (TTLs) because they are easier to set up operationally.

For teams that would like more control over their decisions and when they are made, HTTP-based decisions are a great way to get an ordered list of optimal CDNs for the manifest generator (or player) to consume and consider. This method is also beneficial for those supporting CDN token authentication or who have very strict latency requirements (and precise control over TTLs).
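
As a rough illustration of how a manifest generator or player might consume such an ordered list, the sketch below turns a decision response into one segment URL per CDN, in preference order. The `CdnChoice` type, the `segment_urls` helper, the hostnames, and the segment path are all hypothetical names chosen for this example. They do not describe any particular vendor's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CdnChoice:
    name: str      # hypothetical CDN label
    base_url: str  # hypothetical base URL for this stream on that CDN


def segment_urls(ordered_cdns: list[CdnChoice], segment_path: str) -> list[str]:
    """Return one URL per CDN for a segment, preserving the decision order.

    A manifest generator or player can try the first URL and fall back
    to the next one if the request fails.
    """
    path = segment_path.lstrip("/")
    return [f"{cdn.base_url.rstrip('/')}/{path}" for cdn in ordered_cdns]


# Example decision response, assumed to be ordered best-first by the
# steering service.
decision = [
    CdnChoice("cdn-a", "https://cdn-a.example.com/live"),
    CdnChoice("cdn-b", "https://cdn-b.example.com/live/"),
]
urls = segment_urls(decision, "/event/seg_00042.ts")
```

Because the order comes from the decision service, the consumer stays simple: it only walks the list and never needs to know why one CDN was ranked above another.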

  • Locate your decision service near your publishing infrastructure
    Work with your Traffic Steering vendor to locate HTTP decision infrastructure near (or even in the same cloud region as) your publishing infrastructure. For applications on a tight response latency budget, this is a great way to shave hundreds of milliseconds off your response time.
  • Test live traffic before the big event
    It’s also important to test live traffic before the big event, even if it’s at a smaller scale than you’re anticipating. If there is a pre-season, try to start routing traffic with the configuration you’d like to use for the big day to test its effectiveness. You don’t want your team’s first time routing traffic to be during the event. Use the smaller events to work out kinks in the process and identify where you may need additional support or even additional CDNs.
  • Overprovision for periodic bursts
    Ensure you have enough capacity to handle sustained high rates of traffic and periodic bursts. Test how you can handle estimated overall and peak traffic. The system should be built to manage traffic at its anticipated highest level. This means setting up the infrastructure so that when it experiences the expected highest nominal traffic, there is still plenty of room for bursts.
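
The capacity advice above can be checked with simple arithmetic before the event. The sketch below is a minimal example under stated assumptions: the reservation figures, the 25% burst margin, and the function names are illustrative and not taken from a real event. It computes the capacity needed for the expected peak plus bursts, then checks whether the remaining CDNs could still carry that load if any single CDN dropped out mid-event.

```python
def required_capacity_gbps(nominal_peak_gbps: float, burst_margin: float) -> float:
    """Capacity needed to carry the expected peak plus room for bursts."""
    return nominal_peak_gbps * (1.0 + burst_margin)


def survives_single_cdn_loss(
    reserved_gbps: dict[str, float], required_gbps: float
) -> dict[str, bool]:
    """For each CDN, report whether the others can carry the load without it."""
    total = sum(reserved_gbps.values())
    return {
        name: (total - capacity) >= required_gbps
        for name, capacity in reserved_gbps.items()
    }


# Illustrative numbers only: reserved capacity per CDN, in Gbps.
reservations = {"cdn-a": 40.0, "cdn-b": 30.0, "cdn-c": 15.0}
needed = required_capacity_gbps(nominal_peak_gbps=40.0, burst_margin=0.25)
report = survives_single_cdn_loss(reservations, needed)
```

With these numbers, losing `cdn-a` would leave 45 Gbps of reserved capacity against a requirement of 50 Gbps, so `report["cdn-a"]` is `False`. That result would suggest reserving more capacity or adding another CDN.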


Traffic Steering and the power in simplicity

Another important tip is to not overcomplicate your routing policies or depend too much on human intervention. Automating routing will optimize performance and provide the best streaming experience possible.

You want to ensure great performance for your users while not going over capacity reservations with your CDN partners. Automating this behavior should be simple once you understand your availability and performance thresholds as well as your commit with each CDN. Automation provides several benefits for performance, including:

  • Staying under capacity reservations
    During the event, you’ll need a way to monitor how much capacity/throughput you’re using at each CDN, and a routing policy that takes that capacity into account. You’ll want to see in real time what the traffic distribution looks like so you have levers to pull if you want to redistribute some traffic. The more CDN options you have and the more closely you track your allocations, the less chance there is of sending all traffic to one CDN and bringing it down. Automating this monitoring and feeding it into your routing policy also ensures you don’t overwhelm any one CDN or exceed the capacity reserved for your event, because the policy can shed load or shift traffic away from a CDN as it approaches its reservation limit.
  • Ensuring the best performance for your users
    It is important to measure how users are experiencing the availability and throughput of each CDN. You can use real user monitoring (RUM) as well as quality of experience (QoE) metrics to understand how your users are experiencing the stream. Set performance thresholds that intelligently remove poorly performing CDNs based on a given user’s geography and network, so that CDN selection stays optimal without interfering with CDN reservations. It’s important to remove only the poorly performing answers rather than always choosing the single best performer: you don’t want wild traffic swings based on a performance difference of a few milliseconds. You want stable CDN selection across your providers.

  • Keep CDN options open
    For large streaming events, craft a routing policy that keeps as many CDNs in rotation as possible. It may be fine to have a policy with strict performance expectations for regular events, but larger events call for slightly more relaxed requirements. A good amount of redundancy is more important than peak performance, because concentrating traffic on too few CDNs risks overloading them and bringing them down.
  • Simple disaster recovery
    Don’t underestimate the need for simple levers. While your configuration should allow automation to do all of the work, you’ll also want to be able to make manual adjustments easily on game day as a disaster recovery plan. You don’t know what could happen during the live event, but you need to be ready to act. If you want to take a CDN completely out of rotation, use a one-step lever to pull it out, and be able to use the same lever to put it back in when you’re ready. You don’t want to have to write and deploy new code for a change like this on the fly.
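
The four points above can be combined into one small routing policy. The sketch below is one possible shape for such a policy, not a description of any vendor's product. The `CdnStatus` fields, the threshold values (98% availability, 5 Mbps, shedding at 90% of reservation), and the choice to weight traffic by remaining headroom are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CdnStatus:
    name: str
    enabled: bool           # manual one-step lever: False takes the CDN out of rotation
    availability: float     # fraction of successful RUM requests, 0.0 to 1.0
    throughput_mbps: float  # RUM throughput for this user's geography and network
    used_gbps: float        # current traffic on this CDN
    reserved_gbps: float    # capacity reserved for the event


def eligible_cdns(cdns, min_availability=0.98, min_throughput_mbps=5.0, shed_at=0.9):
    """Drop only disabled, poorly performing, or nearly full CDNs."""
    enabled = [c for c in cdns if c.enabled]
    healthy = [
        c for c in enabled
        if c.availability >= min_availability
        and c.throughput_mbps >= min_throughput_mbps
        and c.used_gbps < shed_at * c.reserved_gbps
    ]
    # If the thresholds would remove every CDN, fall back to all enabled
    # CDNs rather than returning no answer.
    return healthy or enabled


def traffic_weights(cdns):
    """Weight each CDN by its remaining reserved headroom, not by raw speed."""
    if not cdns:
        return {}
    headroom = {c.name: max(c.reserved_gbps - c.used_gbps, 0.0) for c in cdns}
    total = sum(headroom.values())
    if total == 0:
        return {name: 1.0 / len(headroom) for name in headroom}
    return {name: h / total for name, h in headroom.items()}


# Illustrative fleet; every number here is made up for the example.
fleet = [
    CdnStatus("cdn-a", True, 0.999, 12.0, 10.0, 40.0),
    CdnStatus("cdn-b", True, 0.995, 11.5, 28.0, 30.0),  # near its reservation
    CdnStatus("cdn-c", True, 0.900, 9.0, 5.0, 20.0),    # availability too low
    CdnStatus("cdn-d", False, 0.999, 15.0, 0.0, 20.0),  # manually pulled out
    CdnStatus("cdn-e", True, 0.990, 8.0, 10.0, 20.0),
]
rotation = eligible_cdns(fleet)
weights = traffic_weights(rotation)
```

Here `cdn-a` and `cdn-e` stay in rotation and split traffic 75/25 according to their remaining headroom. Flipping `enabled` back to `True` on `cdn-d` returns it to rotation without a code change, which is the kind of simple lever described above.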

When livestreaming an event, especially a highly anticipated or watched event, the user experience is of the utmost importance. Implementing a multi-CDN infrastructure and automating traffic steering can help ensure your livestream is seamless and viewers enjoy an exceptional experience.
