Networks are getting more complex, and user demand is pushing them to the brink. But with the help of NS1, Wix has built a resilient network infrastructure that allows us to scale like never before and outsmart our vast and complex network.
My role is building network solutions that help Wix’s 225 million users build and maintain resilient websites. For some users, Wix is their very first experience building a website; for others, it is one of many sites within an expansive network.
Over the last 30 months, online search has skyrocketed and more websites have been created to support digital-only experiences. There have been more than 700 million new internet users since the start of the pandemic, according to the International Telecommunication Union.
With so many new internet users and Wix customers, our network has to be in top shape to ensure all digital experiences match expectations and live up to our customers’ dependence on us.
As Wix has grown, we have learned a lot from incidents over the years. Most importantly, we have learned how network operations can be expanded or shrunk as much as we want, based on business needs and technological capabilities.
To effectively serve millions of customers and websites across the globe, we need to prioritize two things:
2. Efficient mitigation procedures
Outages are inevitable and we must prepare accordingly. If something goes wrong, such as connectivity issues for end users, internal infrastructure issues, or any other unexpected factor, we must be in a constant state of resilience with 4-5 availability options. Distribution continues to play a big part in easing the way traffic moves through our network.
You can never have too many availability options and, quite frankly, we’re always thinking of the doomsday scenario in which all our main options are unavailable. Always trying to improve, I wondered how I could set our network up in a way that keeps increasing our reliability.
Prior to NS1, we were lacking additional ways to be resilient and didn’t have a good enough way to move our traffic around when outages arose.
Today, all traffic moves are achieved with internal Wix systems, signaling NS1 to take action on a Filter Chain level, which also has a doomsday protection mechanism.
As for setup itself: configuration management of zones, monitors and more is managed via Terraform and stored in git repositories. We use different types of NS1 Filter Chain monitors (including NS1 native as well as external systems via feeds) and we created our own combination of filter logic together with NS1 to ensure we satisfy company needs in terms of resilience and performance.
Out of all of this, Wix developed two systems that were deeply integrated into NS1 DNS response logic:
Traffic Light - controls Wix traffic across regions
Reactive Production - constantly monitors all available data centers; also triggers Traffic Light to perform traffic steering upon KPI violations
Traffic Light is a simple approach to move traffic between regions using API switch costs (check out the video below to learn more). I love this system because it’s simple, yet has the ability to use predefined monitors to control their status and move traffic freely.
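To make the cost idea concrete, here is a minimal Python sketch of cost-based steering between regions. It is not Wix's implementation or the NS1 API: the `Region` class, the `pick_region` and `set_cost` helpers, the region names and the cost values are all hypothetical. The fail-open fallback is only one guess at what a doomsday protection could look like.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical region record: a lower cost means more preferred."""
    name: str
    cost: int
    healthy: bool = True  # would be driven by monitors in a real system


def pick_region(regions):
    """Return the lowest-cost healthy region.

    Assumption: if every region is marked unhealthy, fail open and
    consider all of them rather than answer with nothing.
    """
    healthy = [r for r in regions if r.healthy]
    candidates = healthy or regions  # fail-open fallback
    return min(candidates, key=lambda r: r.cost)


def set_cost(regions, name, cost):
    """Change one region's cost, which is how traffic gets moved."""
    for region in regions:
        if region.name == name:
            region.cost = cost
            return
    raise KeyError(name)


regions = [Region("us-east", 10), Region("eu-west", 20)]
preferred = pick_region(regions).name   # lowest cost wins
set_cost(regions, "us-east", 100)       # raise the cost to drain the region
after_switch = pick_region(regions).name
```

Raising one region's cost is enough to shift traffic to the next cheapest region, which is why a single switch can be so simple to operate.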
To make sure all data centers are working how they should, Reactive Production measures parameters like response codes, load times, availability, etc. and triggers Traffic Light to do its job.
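The loop described above could be sketched roughly as follows. Everything here is illustrative: the `Kpi` fields, the threshold values, and the `steer_away` callback standing in for Traffic Light are assumptions, not the real Reactive Production code.

```python
from dataclasses import dataclass


@dataclass
class Kpi:
    """Hypothetical per-data-center measurements."""
    error_rate: float     # fraction of responses with error codes
    load_time_ms: float   # typical page load time
    availability: float   # fraction of successful probes


# Hypothetical thresholds; real values would come from business needs.
MAX_ERROR_RATE = 0.05
MAX_LOAD_TIME_MS = 2000.0
MIN_AVAILABILITY = 0.99


def violations(kpi):
    """Return the names of the KPIs that breach their threshold."""
    found = []
    if kpi.error_rate > MAX_ERROR_RATE:
        found.append("error_rate")
    if kpi.load_time_ms > MAX_LOAD_TIME_MS:
        found.append("load_time_ms")
    if kpi.availability < MIN_AVAILABILITY:
        found.append("availability")
    return found


def evaluate(samples, steer_away):
    """Check every data center and trigger steering for violators.

    samples: dict mapping data center name to Kpi.
    steer_away: callback standing in for the Traffic Light trigger.
    Returns the list of data centers that were steered away from.
    """
    drained = []
    for dc, kpi in samples.items():
        if violations(kpi):
            steer_away(dc)
            drained.append(dc)
    return drained
```

Calling `evaluate({"dc-a": Kpi(0.01, 500.0, 0.999), "dc-b": Kpi(0.2, 500.0, 0.999)}, print)` would report only `dc-b`, since its error rate exceeds the assumed limit.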
While Traffic Light and Reactive Production mitigate negative incidents, they also allow us to ensure great performance, cost optimization and resource utilization across multiple CDN providers thanks to NS1’s Pulsar.
Our network is utterly complex. It has to be. As we continue to grow, we have a network that’s able to adapt with us and with our users’ growth.
Users want to search without network issues and business owners want a website platform that gives them the freedom to create, design, manage and develop their brand in a way they want.
NS1 adds resilience and traffic management through the Wix network. Resilience has always been a part of our DNA, so pairing with NS1 provided an additional layer of assurance and has helped us (and our users) outsmart complexity when digital demand is at its peak.
Missed out on our INS1GHTS2022 conference? Catch all of the sessions here: https://insights.ns1.com/ins1ghts2022.