Data-Driven @ Netflix

Slides from Michelle Ufford's talk, Data-Driven @ Netflix. Talk given at PASS Summit 2016 in October 2016.

Netflix is the quintessential data-driven company. Its 83 million members stream more than 125 million hours in over 190 countries every day and generate more than 700 billion events in the process. In this session, we’ll share how data is used to make informed decisions across the entire business, from content acquisition to content delivery and everything in between. We’ll look at how Netflix successfully employs a scalable cloud-based data platform to support a constant deluge of data and a small army of data analysts, engineers, and scientists. We’ll discuss the advanced analytical capabilities that are enabled through modern data technologies. Lastly, we’ll explore some of the architectural & operational principles that enable Netflix to make such effective use of its data.

Data-Driven @ Netflix

  1. Data-Driven @ Netflix. Michelle Ufford, Principal Architect, Data Engineering & Analytics
  2. Michelle Ufford: Highlights ● Principal Architect at Netflix, Data Engineering & Analytics ● Prev. Engineering Manager at GoDaddy, Data Platform ● Microsoft Data Platform MVP ● 10+ years building web-scale analytics & data engineering infrastructure ● advises on Big Data topics: Microsoft, Hortonworks, Teradata, etc. [Photo: gratuitous picture of my kids]
  3. By the Numbers.
  4. The business numbers. ● 86.7 million members ● 1000+ devices supported ● 125+ million hours watched every. day. ● launched 19 years ago
  5. Any device. Anywhere.* * Well, almost anywhere.
  6. The data numbers. ● 4 petabyte DW reads ● 300 terabyte DW writes ● 40 petabyte data warehouse ● 700+ billion events written
  7. Data in Action.
  8. Content. What should we license?
  9. Predicting Value for Licensed Content. Stages: Feature Engineering, Predictive Models, License Terms, Content Efficiency
  10. Predicting Value for Licensed Content. ● past performance of similar content on Netflix ● broadcast & Box Office performance ● talent (writers, actors, directors, etc.) ● critic & user reviews ● awards & accolades. Stages: Feature Engineering, Predictive Models, License Terms, Content Efficiency
  11. Predicting Value for Licensed Content. Stages: Feature Engineering, Predictive Models, License Terms, Content Efficiency
  12. Predicting Value for Licensed Content. ● terms (length, exclusivity, etc.) ● bid amount ● negotiations. Stages: Feature Engineering, Predictive Models, License Terms, Content Efficiency
  13. Predicting Value for Licensed Content. ● value / cost ● if efficient, license. Stages: Feature Engineering, Predictive Models, License Terms, Content Efficiency
  14. “ last year our original content overall was some of our most efficient content. ”
  15. “ We are building a studio in the cloud and pioneering new approaches to movie production, optimizing pitches, production schedules, subtitling, and digital asset management for our Original content. ”
  16. What should we license? create?
  17. Product UX. Data. Driven. Experience.
  18. There are 86 million different versions of Netflix.
  19. ● billboard ● rows, row order ● titles, title order ● title artwork
  20. Public Relations. Analytics of news. Analytics is news.
  21. Content Delivery. Monitoring a global service.
  22. YouTube video of Vizceral demo: https://youtu.be/JctsPpgEsVs
  23. Behind the Scenes.
  24. [Architecture diagram; labels: data access, AWS S3, Big Data Platform, Amazon Redshift, data processing, fast storage, data viz, Metacat, data services, events data, operational data, elastic storage, Apache Pig]
  25. Philosophy.
  26. Freedom & Responsibility.
  27. Context, Not Control.
  28. Highly Aligned & Loosely Coupled.
  29. Big Data Platform
  30. Data Engineering & Analytics. [Org diagram; labels: Marketing, Product, Playback, Content, Finance; 105 talented engineers & analysts; data viz engineers, analytics engineers, data engineers, analysts; Big Data Platform]
  31. Results, Not Opinions.
  32. Experimentation Platform ● Batch & Ad Hoc Analysis
  33. Questions?
  34. Thank you for attending! Michelle Ufford: linkedin.com/in/mufford, @sqlfool, hadoopsie.com. Data @ Netflix: @NetflixData, techblog.netflix.com, tinyurl.com/NetflixData
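
Slides 9 to 13 frame the licensing decision as a ratio of predicted value to cost ("value / cost", "if efficient, license"). A minimal sketch of that rule follows. The class name, field names, units, and the 1.0 threshold are all assumptions made for illustration; the talk does not state how value is modeled or what cutoff counts as efficient.

```python
from dataclasses import dataclass


@dataclass
class LicenseOffer:
    """Hypothetical container; field names are illustrative, not from the talk."""
    title: str
    predicted_value: float  # assumed output of a predictive model, in dollars
    cost: float             # assumed bid amount under the offered terms, in dollars


def content_efficiency(offer: LicenseOffer) -> float:
    """Efficiency as stated on slide 13: value divided by cost."""
    if offer.cost <= 0:
        raise ValueError("cost must be positive")
    return offer.predicted_value / offer.cost


def should_license(offer: LicenseOffer, threshold: float = 1.0) -> bool:
    """License if efficient. The default threshold of 1.0 is an assumption."""
    return content_efficiency(offer) >= threshold
```

For instance, under these assumptions an offer with a predicted value of 12.0 and a cost of 8.0 has an efficiency of 1.5 and would be licensed, while one with a predicted value of 4.0 at the same cost would not.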

Editor's notes

  • Obligatory “why should you listen to me talk?” slide
  • Numbers as of Q3 2016
  • During CES 2016 in January, Netflix ‘flipped the switch’, making the service available in 130+ new countries. Netflix is presently available in over 190 countries worldwide.
  • What content should we license?
    How much should we bid?
    How should we value exclusivity?
    How should we measure content performance?
  • Original content:
    2015 - 450 hours
    2016 - 600 hours
    2017 - 1000 hours
  • Netflix website: circa 2012
  • Netflix website: circa 2013
  • Netflix website: circa 2014
  • Netflix website: circa 2015
  • Netflix website: circa 2016
  • Vizceral Open-Source Project: https://github.com/netflix/vizceral
    http://techblog.netflix.com/2016/08/vizceral-open-source.html
    http://techblog.netflix.com/2015/10/flux-new-approach-to-system-intuition.html
  • Genie – federated job execution engine
    Metacat – federated metadata service
    Kragle – Python APIs
  • 15m views on SlideShare
  • Minimize rules
    Make smart choices
    Take ownership

  • Avoid prescriptive requirements
    Give visibility
  • Set context (strategy, goals)
    Communicate only as much as needed

  • At Netflix, we use the scientific method
    We’re often right at predicting behavior – for people exactly like us
    Most people aren’t like us

