Design Netflix (System Design)


Hi there! In this blog post we will look at "Design Netflix". By design, I don't mean Netflix's complete current architecture; instead we will look at how to design its core parts. Every person has a different way to tackle this question. Here I will present my view of the core aspects of Netflix. It is just one solution, not the perfect answer. So let's get started!

Netflix seems like a pretty straightforward service. Users go to the platform and are served video content such as movies and TV shows, which they can then watch. That is the core Netflix product. There are many auxiliary services that are part of Netflix, such as authentication (users can create an account and log in), profiles, and payments (users can subscribe to Netflix and pay with credit cards, net banking, etc.). As I said, we will only look at the core part, plus a recommendation engine, since we are going to have access to a lot of user activity data. So we need to come up with a way to aggregate that user activity on the application and process it into, say, a set of scores for the content.

Based on this, it makes sense to divide this entire system into two major subsystems.

  1. Core user-video flow
  2. Recommendation Engine

The former covers anything that involves storing and streaming videos, storing user metadata, and serving static content about videos such as names and descriptions. The latter is somewhat tangential to the core user-video flow.

For the core user-video flow we care about latency, because users want to stream video essentially in real time. We also care about high availability, and we should consider a global audience rather than a regional one. The recommendation engine, on the other hand, runs asynchronously, so we don't have to worry about its latency. Netflix has around 150M - 200M (million) users; to be safe, we will assume 200M users. Even though Netflix is used on different clients such as desktop computers, tablets, and phones, we will focus on the distributed-system aspect and won't go into detail about the clients themselves.

1. Core User-Video Flow

As already mentioned above, there are three pieces of data that are most relevant to the core user-video flow.

Data:

  • Video Content
  • User Metadata (watched/liked/how long or how much watched)
  • Static Content (name, description etc.)

Storage Solution:

Let's do a bit of estimation on how much storage we're going to need, because that will determine which storage solutions we go with. Let's start with video.

1. Video Content

So, for the video, Netflix doesn't actually have that many titles, because it is not a service like YouTube where any user can upload an unlimited amount of video content. Netflix has a fairly limited catalogue that it curates itself, so it is fair to estimate that Netflix has around 10,000 videos (erring on the high side to keep our estimates safe).

Netflix has multiple resolutions (video qualities) for each title, for example Standard Definition, High Definition and 4K, so we need to store every video in all of those resolutions. For now let's go with Standard and High Definition. A video is on average about an hour long, because movies are around 2 hours and TV episodes are around 20 - 30 minutes. An hour of Standard Definition video is roughly 10 gigabytes (10 GB/hr), and for High Definition let's take double that, so 20 GB/hr. That means 30 GB for every single video. Since Netflix has about 10,000 videos, that's 10,000 x 30 GB = 300,000 GB = 300 TB. This is actually not much data compared to YouTube or Google Drive; it is considerably lower. So we don't need to do anything clever to optimize our video storage, and we can probably get away with just storing the video content in a blob store such as S3 or Google Cloud Storage.
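
If you want to sanity-check the arithmetic, here is the same estimate as a tiny Python sketch; all the constants are just the rough assumptions above, not real Netflix figures.

# Back-of-the-envelope video storage estimate (rough assumptions, not real figures)
NUM_VIDEOS = 10_000          # assumed catalogue size
AVG_LENGTH_HOURS = 1         # assumed average video length
SD_GB_PER_HOUR = 10          # assumed Standard Definition size per hour
HD_GB_PER_HOUR = 20          # assumed High Definition size per hour (2x SD)

gb_per_video = AVG_LENGTH_HOURS * (SD_GB_PER_HOUR + HD_GB_PER_HOUR)   # 30 GB per video
total_gb = NUM_VIDEOS * gb_per_video                                  # 300,000 GB
print(f"{total_gb:,} GB = {total_gb / 1000:,.0f} TB")                 # ~300 TB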

2. Static Content

The static content related to videos is just text, so it is going to take up far less storage than the video itself; it is comfortably bounded by the 300 TB above. We can probably store it either in some sort of key-value store or in a relational database like Postgres.

3. User Metadata

For user metadata, even though the video content takes only 300 TB, we might actually have more user metadata than video content, because we have 200M users. Let's try to estimate it. We will be storing user metadata per user, per video, because this metadata will be things like whether a user has watched a video (roughly a boolean) and how much of the video they have watched. So how many videos do we estimate a user will watch in their lifetime? After all, we are trying to estimate how much storage we need for all user metadata across all of Netflix, in perpetuity. Let's assume a single user watches about two shows a week, and call it 50 weeks in a year (52, really). That gives 100 shows per year. Let's take an average lifetime of 10 years for a user on Netflix, since Netflix has only been around for about 20 years. That's 10 x 100 = 1,000 shows in a lifetime, so per user we will be storing metadata on about 1,000 videos.

Now, how big is this user metadata? It doesn't seem like it's that big. If the metadata is something like whether the user has watched a video and how much of it they have watched, we are talking on the order of bytes, maybe 10 - 100 bytes per video. Let's go with 100 bytes per video per user, since we don't know what kind of overhead the storage medium might add. So 100 bytes per video per user, times the 1,000 videos per user in a lifetime, gives 100 bytes x 1,000 videos = 100 kilobytes per user (100 KB/user).

We have 200M users, so 100 KB/user x 200M users = 20 TB of user metadata. This is not much data, and we can probably just store it in a relational database like Postgres, because we are likely going to query this data pretty often.
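
Here is the same estimate as a quick sketch, with every assumption spelled out as a constant:

# User metadata storage estimate (all numbers are the rough assumptions above)
USERS = 200_000_000              # assumed user base
SHOWS_PER_WEEK = 2               # assumed viewing rate
WEEKS_PER_YEAR = 50              # rounded down from 52
YEARS_ON_PLATFORM = 10           # assumed user lifetime on Netflix
BYTES_PER_VIDEO_RECORD = 100     # watched flag, progress, etc. plus storage overhead

videos_per_user = SHOWS_PER_WEEK * WEEKS_PER_YEAR * YEARS_ON_PLATFORM   # 1,000 videos
bytes_per_user = videos_per_user * BYTES_PER_VIDEO_RECORD               # 100 KB/user
total_bytes = bytes_per_user * USERS
print(f"{total_bytes / 1e12:.0f} TB of user metadata")                  # ~20 TB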

Netflix users are by nature isolated from one another, and users are the only people whose latencies we care about, so we can shard whatever relational database stores this user metadata at the user ID level. So imagine we have a relational database like Postgres for user metadata: we can split it into a handful of shards, maybe 5 - 10, where each shard holds 2 - 4 TB of data, and the shard is chosen based on the user ID.
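
As a minimal sketch of how routing to the right shard could work, assuming a fixed number of Postgres shards and a stable hash of the user ID (the shard count and connection strings below are made up for illustration):

import hashlib

# Hypothetical shard connection strings; 8 shards of ~2.5 TB each covers ~20 TB
SHARDS = [f"postgres://metadata-shard-{i}.internal/netflix" for i in range(8)]

def shard_for_user(user_id: str) -> str:
    # Pick a shard deterministically from the user ID so a user's metadata always lives together.
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

print(shard_for_user("uID1"))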

For static content, like I mentioned above, we can just put it in another relational database.

So that's it for the storage aspect; it is less complicated than we were expecting. The complicated part will actually be streaming or serving this data, and not the static data or the user metadata but the video, because the core Netflix service is being able to watch videos and we care about very fast latencies globally.

We have 200M users, so it makes sense to estimate our bandwidth consumption at any given point in time to see how much data we are going to need to serve.

They are not all going to be watching at once. We can assume that during peak hours maybe 1% of Netflix users are watching a show, and if a super popular show has just come out, maybe 5% of the global Netflix user base is watching a video at once.

So 5% of 200M is 10M users, and we will assume 10M concurrent viewers is the peak traffic. Let's say they are all watching High Definition video, just to be conservative, which means each of those 10M users needs 20 GB/hr. How many megabytes per second is that? Let's say there are 4,000 seconds in an hour (3,600 really, rounded up for easy calculation). So 20 GB / 4,000 s = 5 MB/second per user, and we have 10M users at peak traffic: 5 MB/s x 10M = 50,000,000 MB/s, i.e. on the order of 50 TB/s of total bandwidth consumption.
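
Here is that bandwidth estimate as a sketch; the 4,000 seconds per hour is deliberately rounded up from 3,600 to keep the numbers easy:

# Peak bandwidth estimate (rounded, rough assumptions)
PEAK_USERS = 10_000_000          # 5% of 200M users
HD_GB_PER_HOUR = 20              # assumed HD consumption per user
SECONDS_PER_HOUR = 4_000         # rounded up from 3,600 for easy math

mb_per_sec_per_user = HD_GB_PER_HOUR * 1000 / SECONDS_PER_HOUR    # 5 MB/s per user
total_mb_per_sec = mb_per_sec_per_user * PEAK_USERS               # 50,000,000 MB/s
print(f"{mb_per_sec_per_user:.0f} MB/s per user, "
      f"{total_mb_per_sec / 1e6:.0f} TB/s at peak")               # ~50 TB/s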

That is a huge amount, far more than a single data center or even a few data centers can handle. So we are going to want to spread this load across a ton of machines, and more specifically across a ton of locations. It can't be a single data center with a bunch of machines, because that is still one bottleneck at the data-center level as far as bandwidth goes. We want this load spread out across many different locations, so we will probably use a CDN such as Cloudflare for video serving. The key point is that it has to have a lot of Points-of-Presence (PoPs), and each PoP will consist of hundreds or thousands of machines. That way, if we have a CDN with 1,000 PoPs, we bring the bandwidth per PoP down to the order of tens of gigabytes per second, and if each PoP is itself a cluster of machines, we bring it down to the order of hundreds of megabytes per second per machine.
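
Continuing the arithmetic, here is how that peak load could fan out across PoPs and machines; the PoP and machine counts are just illustrative assumptions:

# How the ~50 TB/s peak load spreads over a CDN (illustrative counts only)
TOTAL_GB_PER_SEC = 50_000        # ~50 TB/s at peak, from the estimate above
NUM_POPS = 1_000                 # assumed number of CDN Points-of-Presence
MACHINES_PER_POP = 100           # assumed cluster size per PoP

per_pop = TOTAL_GB_PER_SEC / NUM_POPS                   # ~50 GB/s per PoP
per_machine = per_pop / MACHINES_PER_POP * 1000         # ~500 MB/s per machine
print(f"{per_pop:.0f} GB/s per PoP, {per_machine:.0f} MB/s per machine")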

But how do we make sure the PoPs have the right data locally? We definitely need some caching: our PoPs need to have the video content cached. We can have a service that lives between the CDN and the blob store and asynchronously populates the PoPs with the video content they need to have. You can imagine that when Netflix releases new movies, this service, let's call it a cache populator, makes sure those new movies get sent out to the PoPs. It can also prioritize popular movies and deprioritize unpopular ones.
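
Here is a minimal sketch of what such a cache populator could look like. The blob-store and PoP classes are in-memory stand-ins invented for illustration, not real APIs, and the popularity scores are made up:

# Hypothetical cache populator: pushes new/popular titles from blob storage to CDN PoPs.
class BlobStore:
    def location(self, video_id):
        return f"s3://netflix-videos/{video_id}"         # stand-in for a real object URL

class PoP:
    def __init__(self, name):
        self.name, self.cache = name, {}
    def has(self, video_id):
        return video_id in self.cache
    def prefetch(self, video_id, location):
        self.cache[video_id] = location                   # stand-in for an async copy into the PoP

def populate_pops(new_video_ids, blob_store, pops, popularity_score):
    # Push the most popular titles first so important content lands in every PoP early.
    for video_id in sorted(new_video_ids, key=popularity_score, reverse=True):
        location = blob_store.location(video_id)
        for pop in pops:
            if not pop.has(video_id):
                pop.prefetch(video_id, location)

populate_pops(["vidID1", "vidID2"], BlobStore(), [PoP("pop-eu-1"), PoP("pop-us-1")],
              popularity_score={"vidID1": 0.9, "vidID2": 0.4}.get)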

So this is how we achieve video streaming at that level of bandwidth consumption. Netflix in real life solves the issue by partnering with internet service providers themselves. Instead of using a classic CDN, Netflix partners with the Verizons and AT&Ts of the world and injects a special caching layer, built by Netflix, at locations called Internet Exchange Points (IXPs). These are the counterpart of a CDN's PoPs. The whole thing behaves essentially like an optimized CDN, which gives Netflix much cheaper bandwidth and gives users much lower latency as well. This pretty much wraps up the core user-video flow.

We also need to think about how we actually update the user metadata, fetch the static content, and have users request video content from the CDN (and eventually from the blob store). For that we are probably going to have a layer of API servers sitting in front of these stores.

2. Recommendation Engine

We said we might want to gather more granular user activity, something like logs, and we assume these get fed into our recommendation engine, or rather into the system that backs it. For example, we could have some sort of MapReduce job that asynchronously consumes all these logs, which are stored somewhere, and spits out more meaningful data that is used elsewhere in our system.

Since we're going to be running MapReduce jobs on this data, we're going to need a distributed file system, so we can store our logs in something like the Hadoop Distributed File System (HDFS). Our users are effectively sending logs to HDFS all the time.
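
As a rough sketch, each client event could be serialized as a JSON line and appended to whatever sink feeds HDFS; here a local file stands in for the real pipeline, which in practice would usually go through something like Kafka before landing in HDFS:

import json, time

def log_event(user_id: str, event: str, video_id: str, sink_path: str = "activity.log") -> None:
    # Append one user-activity event as a JSON line; a local file stands in for the HDFS sink.
    record = {"userId": user_id, "event": event, "videoId": video_id, "ts": time.time()}
    with open(sink_path, "a") as sink:
        sink.write(json.dumps(record) + "\n")

log_event("uID1", "pause", "vidID1")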

Then we have an asynchronous MapReduce job that takes the logs from HDFS and actually processes them. The raw logs might look something like this:

{
  userId: "uID1",
  event: TYPE,        // an enum of granular engagement events such as "click", "play", "pause", "watch"
  videoId: "vidID1"
}

So if these are our logs, stored in HDFS, we feed them into our map function. The map function takes these logs as input and aggregates them at the user ID level, because this is a recommendation engine and we probably want to spit out something like a score for each user and video. So it makes sense to have the map function output intermediary key-value pairs keyed on the userId. For example:

uID1: [(vidID1, pause), (vidID2, play)]

These key-value pairs are fed into the reduce function. We're not sure exactly how this works in a real system like Netflix; we can assume they have machine learning models being trained on this intermediary data, or some sort of data pipeline that grabs the data in the middle of the reduce step. But eventually we spit out the final scores we are looking for. This might be something like a userId pointing to a video and a score, or maybe a userId pointing to a stack ranking of videos:

uID1: [vidID1, vidID2, vidID3]
(or)
uID1: (vidID1, score)

Depending on your use case, or on what data you think is meaningful for your recommendation engine, you could write your reduce function in various ways.
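
To make the shape of the job concrete, here is a minimal single-machine sketch of the map and reduce steps. A real implementation would run on a framework like Hadoop, and the engagement weights here are invented purely for illustration:

from collections import defaultdict

# Invented engagement weights; a real system would learn these, not hard-code them.
EVENT_WEIGHTS = {"click": 1, "play": 3, "pause": 0, "watch": 5}

def map_log(log):
    # Emit an intermediary (userId, (videoId, event)) pair for each raw log record.
    yield log["userId"], (log["videoId"], log["event"])

def reduce_user(user_id, pairs):
    # Aggregate a user's events into a per-video score, then rank the videos.
    scores = defaultdict(int)
    for video_id, event in pairs:
        scores[video_id] += EVENT_WEIGHTS.get(event, 0)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return user_id, ranked                      # e.g. ("uID1", ["vidID2", "vidID1"])

logs = [{"userId": "uID1", "event": "play", "videoId": "vidID2"},
        {"userId": "uID1", "event": "pause", "videoId": "vidID1"}]
grouped = defaultdict(list)
for log in logs:
    for key, value in map_log(log):
        grouped[key].append(value)
print([reduce_user(u, pairs) for u, pairs in grouped.items()])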

So that's it for the recommendation engine and the core design of Netflix. Thanks for reading. Until next time!