Netflix Gives Analytics in the Cloud a Starring Role
1. Netflix Gives Analytics in the Cloud a Starring Role
Vishal Jain
Manager (Data Platform), Netflix Inc.
2. • Multi-day outage in our Data Center infrastructure
• New challenges of our growing streaming business
• Ever-increasing data volume and international expansion
• Undifferentiated heavy lifting
Evolution of Netflix Cloud Infrastructure
3. > Grow naturally
> Spikes
> Reservations
> Spot
Why we adopted the Cloud?
Elasticity
4. > Multi-Data center
> International
Why we adopted the Cloud?
Redundancy / Global Presence
5. > Available
> Storage, Queue, Processing
> Netflix OSS
> Focus
Why we adopted the Cloud?
Services
6. • Cost?
• Never Ending Fight
• Data Center Mindset
• Bigger Fish
• Specialized Hardware
Cloud Considerations
7. • Amazon Redshift
• Teradata Cloud
Data Warehouse in Cloud
Cloud Integration
Cloud Expertise
DW Maturity
Our Codebase
8. • One-time data migration
• Exactly the same codebase running in cloud
• Capacity On Demand
• Network peering with AWS & Netflix
• VPN/VPC infrastructure
Netflix Teradata Cloud
----- Meeting Notes (8/20/14 15:40) -----
Good morning everyone, and welcome to this morning’s session. I hope you are all having a great conference so far.
My name is Vishal and I’m part of the Netflix Data Platform team, where we manage Netflix’s Teradata infrastructure.
In this presentation I will talk about what led us to adopt the cloud as a platform and how we went about evaluating Teradata Cloud. I’ll end the discussion with a high-level overview of our current Data Platform layout.
Full disclosure: this presentation was given earlier by Kurt Brown as a webinar.
The webinar was very well received, and I thought it would be useful to present it again in a bigger forum like Partners.
In the interest of time I have made some changes to the original slides, but the overall message remains the same.
In 2008 we had a multi-day outage in our infrastructure. What led to that outage isn’t that important. What was important was that it made us aware that our data center based infrastructure wasn’t sufficient to serve our business, especially given the growth plans the company had at that time.
We had two options at that time: scale the data center to meet our needs, or move away from it entirely and adopt the then still new concept of a cloud platform as a service. We decided on the latter.
It was not a trivial decision for us. We knew we didn’t want to be in the business of maintaining data centers, and we had an ambitious growth plan.
Our business was changing, and Netflix was moving into the business of streaming movies. Data volume, velocity and variety changed suddenly compared to our DVD data.
From a few million DVDs shipped, we were now talking about analyzing billions of events.
The reach of the Netflix service was also expanding, from a primarily US-bound service to our plans to grow internationally. This was an important factor for cloud adoption, as we knew we needed to operate the service closer to consumers to give them a good streaming experience. Otherwise that could have meant opening our own data centers worldwide and maintaining them.
And finally, we just wanted to focus on improving our product and service and not get distracted by figuring out storage, networking, servers, etc.
Elasticity is probably the single biggest advantage of moving to the cloud.
Spikes can be both seasonal and within a day.
The spot market lets you bid on extra instances.
Reservations are kind of the reverse of elasticity, since you reserve the instances for a period of time. But that’s not the case.
I might sound as if the cloud is the one answer to all of an organization’s infrastructure and computing needs.
But before you jump and make that call, there are a few things I would like to highlight where the cloud might not be that viable.
So what are some of the considerations you should be aware of when selecting the cloud, and where might it not make sense for your infrastructure?
1> Cost: I’m sure you have heard the entire spectrum of opinions, from how cloud infrastructure is far more costly to how you can save money by adopting the cloud. I would say do not make cost the driving factor in your decision-making process.
2>
3> Data Center Mindset: while it is possible to do so, forklifting your infrastructure to the cloud is a recipe for failure. Your entire engineering team should have the mindset and skills to enhance the applications and code (and sometimes even rewrite them) to use the cloud. Applications should be written to be resilient to hardware failures; they should be able to manage redundancy and run highly parallel. It is a move from vertical scaling to horizontal scaling.
4> Companies that themselves run at very large scale (I’m talking about the Googles and Facebooks), so large that they are bigger than the platform provider: at that point it makes more sense for them to set up their own infrastructure.
5> And finally, if your applications need specialized hardware that is deeply embedded in their optimization paradigm, then getting that configuration in the cloud could be challenging.
Big relational databases definitely fall into this category. Their extremely high performance is finely tuned to their hardware setup, which is definitely one of the factors why it took them longer to provide cloud-based services.
Which is a great segue to talk about how Netflix solved this problem.
We had a simple requirement: to be able to run a database of our size in the cloud without losing out on functionality and performance.
For Teradata this was a challenge, because its specialized hardware and software are so well optimized together. That left us with much of our processing happening in the cloud while our Teradata system, along with some pending applications, was still in the data center. On top of that, we were fast running out of capacity and required an upgrade.
Given that we were already in the cloud, we decided we wouldn’t invest further in our data center applications, which meant we were pushed into living within our means in Teradata and started being very selective about what came down into Teradata. But that was clearly not a great situation to be in.
Redshift announcement: early last year AWS came out with a big MPP database offering. It’s a good offering, and we kicked the tires. The problem was that we have an open production environment where we pound our database hard, with upwards of 70 concurrent sessions, which requires us to have visibility into the workload and monitoring.
So we were mostly in waiting mode when Teradata came back to us with a cloud offering. We were initially quite skeptical about it, but as we learned more it became clear that Teradata was taking the cloud seriously and the offer was quite compelling.
So on one side we have a service that is much more mature in its cloud offering and integration, and on the other side we have a mature database and an effortless migration. It is clearly a race, and we will all benefit from it. I would like to see Redshift become more mature as a database, and at the same time I would love to see Teradata become more mature in its cloud offering.
Self service
Elasticity
Pricing
Integration
Nothing new here. Everyone knows that. What I’m saying is that with the exponential growth of data, forecasting infrastructure needs will keep getting more difficult, which will result in continuous upgrades and the pain that comes with them. And we all know that negotiating contracts and going through a purchase cycle is never a fun activity. Open pricing and self-serve options to add resources remove that altogether.
The second takeaway is that a hybrid setup might just be the right solution.