Machine Learning at Netflix Scale
Aish Fenton
Manager - Research Engineering
@aishfenton
Everything is a recommendation
Top Picks for Aish
Movies based on books
Because you watched Bob’s Burgers
Rank based on your taste
75% of plays come from the homepage
Back Story…
Proxy question:
▪ Accuracy in predicted rating
▪ Improve by 10% = $1 million!
What we were interested in:
▪ High quality recommendations
[Figure: predicted vs. actual ratings]
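The proxy question above, accuracy in predicted rating, is commonly measured as root mean squared error (RMSE) between predicted and actual ratings, which was the Netflix Prize metric. A minimal sketch in Python follows; the function name `rmse` and the sample ratings are illustrative assumptions, not data from the talk.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical ratings on a 1-5 star scale (illustrative values only).
predicted = [3.5, 4.0, 2.0, 5.0]
actual = [4.0, 4.0, 1.0, 5.0]
score = rmse(predicted, actual)

# A 10% improvement means driving the error down to 90% of the baseline.
target = 0.9 * score
```

Lower is better: a model that predicts every rating exactly scores 0.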
SVD, RBMs
Top two results still used in production!
2006 > 2013
• > 44M members
• > 40 countries
• > 5B hours in Q3 2013
• Log 100B events/day
• 31.62% of peak US downstream traffic
Data and Models
▪ > 40M subscribers
▪ Ratings: ~5M/day
▪ Searches: >3M/day
▪ Plays: > 50M/day
▪ Streamed hours: 5B hours in Q3 2013
Geo Info
Time
Impressions
Device Info
Metadata
Social
Ratings
Demographics
Member Behavior
Plays
Aish × House of Cards: Latent User Vector · Latent Item Vector = 3.53
[Figure: rating matrix R factored as U × M, with user vectors u1, u2, u3 and movie vectors m1, m2, m3]
My rating for House of Cards (Aish):
Rating = Mean Rating + My Bias + Movie Bias + Interaction
3.55 = 2.50 + (-1.5) + 1.2 + pq
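The decomposition above (mean rating, my bias, movie bias, and a latent interaction term pq) can be sketched in Python. The bias values are the ones on the slide; the latent vectors `p_aish` and `q_house_of_cards` are hypothetical, chosen only so that their dot product makes the arithmetic come out to 3.55.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def predict_rating(mean_rating, user_bias, item_bias, user_vec, item_vec):
    """Biased matrix factorization: mean + user bias + item bias + p . q."""
    return mean_rating + user_bias + item_bias + dot(user_vec, item_vec)


# Bias terms as shown on the slide.
mean_rating = 2.50
my_bias = -1.5
movie_bias = 1.2

# Hypothetical 3-dimensional latent vectors (assumption): p . q = 1.35.
p_aish = [1.0, 0.5, 0.2]
q_house_of_cards = [1.0, 0.5, 0.5]

rating = predict_rating(mean_rating, my_bias, movie_bias,
                        p_aish, q_house_of_cards)
```

The bias terms capture effects that do not depend on the pairing (a member who rates low in general, a title that rates high in general), so the latent vectors only have to explain what is specific to this member and this title.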
[Figure: the factorization of R extended with a Time factor T (t1, t2, t3) alongside U (u1, u2, u3) and M (m1, m2, m3); values shown for Aish and House of Cards: 3.53, 2.35, 1.34]
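One way to read the figure is that a third factor for time turns the matrix factorization into a tensor factorization. The slide does not say which tensor model is used, so the sketch below assumes a CP-style (CANDECOMP/PARAFAC) three-way product. All vectors and the weekend/weekday time slots are hypothetical.

```python
def predict_with_time(user_vec, item_vec, time_vec):
    """CP-style three-way interaction: sum over k of u_k * m_k * t_k."""
    return sum(u * m * t for u, m, t in zip(user_vec, item_vec, time_vec))


# Hypothetical latent vectors (assumptions, not values from the talk).
u_aish = [1.0, 0.5, 0.2]
m_house_of_cards = [1.0, 0.5, 0.5]
t_weekend = [1.0, 2.0, 0.0]
t_weekday = [1.0, 0.0, 2.0]

weekend_score = predict_with_time(u_aish, m_house_of_cards, t_weekend)
weekday_score = predict_with_time(u_aish, m_house_of_cards, t_weekday)
```

The time vector reweights each latent dimension, so the same member and title can score differently depending on when the recommendation is made.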
▪ Matrix/Tensor Factorization
▪ Regression models (Logistic, Linear, Elastic nets)
▪ Factorization Machines
▪ Restricted Boltzmann Machines
▪ Markov Chains & other graph models
▪ Clustering / Topic Models
▪ Neural Networks
▪ Association Rules
▪ GBDT/RF
▪ …
[Chart: Improvement Over Baseline, on a scale from 0% to 300%, for three models: Popularity; + Ratings; + More Features & Optimized Models]
Anatomy of a Machine Learning Platform
[Diagram labels: Problem, Data, Experiment, Offline, Produce Model, Test / Metrics]
[Diagram: platform architecture in three layers (Online, Near-line, Offline). Components shown: UI Clients, Event Distribution, Online Algs, Model Trainer, Pre-compute, AB Test Metrics, API Layer, Monitoring, Hadoop / Data Warehouse, Experimentation Platform, S3 / HDFS, Offline Metrics, Query Tools, Models]
▪ App Logs
▪ User Actions
▪ Ratings
▪ Plays
▪ Queue Adds
▪ Algo Actions
▪ Impressions (Presentation Bias)
▪ Context
▪ Device Info
▪ User Demographics
▪ Social
▪ Time
▪ …
Many different types of data…
Embedded
Embedded
Weights
Real-time popularity of movie
Example: Neural Network Training
[Figure: neural network with parameters θ; layers: Input, Hidden Layer(s), Output]
Neural Network Training
1,536 cores
G2 Instances
$0.60 per hour
But… things can go astray
[Figure: R factored as U × M, with user vectors u1, u2, u3; stages labelled Pre-compute and Online]
Aish played HoC
Publish new model for Aish
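The flow above (a play event arrives, then a new model is published for that one member) can be sketched as a per-member update. The talk does not describe the update rule, so this is an assumption: the item vector is held fixed and only the member's latent vector is refit by gradient descent on the new observation. The function name, the target score of 1.0 for a play, and all vectors are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def update_user_vector(user_vec, item_vec, observed,
                       learning_rate=0.05, reg=0.02, steps=200):
    """Refit one member's latent vector on a single new observation,
    keeping the item vector fixed (regularized squared error)."""
    vec = list(user_vec)
    for _ in range(steps):
        error = observed - dot(vec, item_vec)
        vec = [v + learning_rate * (error * q - reg * v)
               for v, q in zip(vec, item_vec)]
    return vec


# Hypothetical values: a play is treated as a target score of 1.0.
q_house_of_cards = [1.0, 0.5, 0.5]
p_aish_old = [0.1, 0.1, 0.1]
play_signal = 1.0

p_aish_new = update_user_vector(p_aish_old, q_house_of_cards, play_signal)
old_error = abs(play_signal - dot(p_aish_old, q_house_of_cards))
new_error = abs(play_signal - dot(p_aish_new, q_house_of_cards))
```

Because only one short vector changes, an update like this is cheap enough to run per event, while the item factors can be retrained on a slower schedule.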
Aish Fenton
@aishfenton
https://www.linkedin.com/profile/view?id=47917219

Editor’s notes

  1. Who in the audience has an ML background? Who has a big data background? Who’s an engineer? I’m going to cover a bit of everything: a few models, our approach to the architecture of ML systems, and how it all comes together. Feel free to ask questions as we go along.
  2. We use machine learning in many places at Netflix, but the place we’re perhaps best known for it is our recommender systems and personalization. So I wanted to start with a quick overview of what personalization means at Netflix.
  3. If you’ve logged into Netflix before, this should look familiar. This is what you see when you log in to our website. What you might not realize, however, is that almost every element on this page is driven by an ML algorithm.
  4. There are the obvious recommendations. We have a row of explicit recommendations, where we pull together everything we know about you and present our “Top Picks” for you.
  5. You’ll also see “Genre” rows, which provide shows around a particular theme. Movies are tagged in our system based on a number of different aspects. The tags are added editorially by our team of content experts. Which genres we pick, however, is personalized. So “Movies based on books” is shown to me based on my predicted likelihood of wanting to watch this genre. There’s also a level of personalization within the row itself. A genre like “Movies based on books” spans a lot of different tastes. For example, movies about Wall Street, documentaries on the GFC, and young adult fiction are all types of “Movies based on books”, but they serve different tastes. Based on what we know about you, we can construct a set of “Movies based on books” tailored to your particular view of what that means.
  6. We also do “Similar” rows. As the title says, because I recently watched Bob’s Burgers, here are some choices that are similar to it.
  7. Even our marketing images are personalized. Many of the hero images and much of the marketing you see within Netflix are personalized to your taste. I see OITNB here because it fits my tastes.
  8. Finally, we put it all together. Unsurprisingly, most of what people play comes from the top left-hand corner; if they are forced to scroll further down or to the right, we have failed to predict what they want to watch. So we also rank the entire page. I’ve already shown how we rank the titles within each row left-to-right. We also rank the rows top-to-bottom, so that the rows most relevant to you are pushed to the top of the page.
  9. The net result of this personalization is that 75% of what our users watch is selected from the homepage, from the rows I’ve just shown you. That means we’ve been able to provide a very personalized experience, where what users see on the homepage when they log in to Netflix matches pretty well with what they want to watch.
  10. - Okay, I’m going to take a minute now to provide some back story.
  11. Who’s heard of the Netflix Prize? It ran from 2006 to 2009. The grand prize was won in 2009 by BellKor’s Pragmatic Chaos, a merged team that included the members of Team KorBell (AT&T).
  12. The challenge was: we give you 100M anonymized user ratings to build a “rating prediction” model with. We then get you to predict 2.8M ratings that we already know but held back. If you can improve on our predicted ratings by 10%, we give you 1 million dollars. We measure this as the root mean squared error between your predicted rating and the real rating that we held back. Team KorBell (AT&T) won the 2007 Progress Prize by improving the predictions by 8.43%; the grand prize went to BellKor’s Pragmatic Chaos in 2009 with a 10.06% improvement. http://mathurl.com/osuomvj
  13. Two significant algorithms came out of the Netflix Prize: SVD (Prize RMSE: 0.8914) and RBM (Prize RMSE: 0.8990). They were already known in academia, but hadn’t made their way into industry recommender systems. I talk through how SVD works at a high level in later slides. These two algorithms are still used in parts of the Netflix recommender system to this day.
  14. There are limitations, though. Ratings != plays. People’s ratings are somewhat “aspirational”: they may rate Citizen Kane 5 stars, but what they watch is Sharknado. For our use case, we’re interested in predicting what people actually want to watch, not what they think are critically worthy movies.
  15. Netflix has also changed a lot since the start of the Netflix Prize. In 2006 we were mailing out DVDs. Now we’re more about streaming to devices. This also changed people’s viewing habits. The investment in selecting a great DVD that the entire family could watch was higher: everyone had to agree on it, and getting it wrong might ruin your night. With streaming, people want content that is more personalized and more sensitive to the context of what they want to watch NOW.
  16. Netflix has also grown. A lot. Algorithms that worked in 2006 don’t necessarily work with the volume we have now.
  17. Okay, so let’s dive a little into the models and data we use to do our personalization.
  18. On the data side we have a lot to work with. There’s a lot of signal that we get beyond straight plays and ratings. If you think about it, the context in which someone chooses what to watch tells you a lot too.
  19. So I want to give you a quick overview of how SVD (aka matrix factorization) works. This is one of the classic algorithms used in the Netflix Prize, and it was a big breakthrough at the time. It should give you a flavor of how these systems work. The basic model is: http://mathurl.com/pgux65w
  20. - To make that more visual
  21. http://mathurl.com/pgux65w
  22. http://mathurl.com/l4w5yd6
  23. http://mathurl.com/l4w5yd6
  24. So that’s one of the foundational algorithms used in recommender systems. But things have moved on a lot since then. These days we’re mostly focused on ranking rather than rating prediction. This lets us balance things like diversity, freshness, and global popularity against our prediction of how well a title fits your tastes. We are A/B testing (or have A/B tested) many of these. Which algorithm to use really depends on your application and what you’re trying to achieve. All have pros and cons. You’ll likely end up with a few different algorithms for different parts of the problem. The important thing is to test them in your production system.
  25. Over time we’ve been able to improve on the results we got from the Netflix Prize. It’s been a combination of adding more data and adding more sophisticated models. As you can see here, we’ve moved things on a lot. These are improvements to Netflix’s core business metrics, so even a 1% improvement equates to real benefits to the business. One quick note: always make sure you select a realistic baseline to test against. Straight global popularity is usually pretty tough to beat, so you can fool yourself if you’re not testing against that, or your equivalent of it.
  26. So now that you have an idea of what a recommender system algorithm looks like, let’s see how you can productionize it.
  27. So here’s the core workflow you’ll need to support. Whatever decisions you make about your architecture, you’ll need to make this process seamless. The machine learning approach: (1) define the problem (what you think needs solving, or a hypothesis of what can be improved); (2) gather data on which to train the model; (3) experiment offline to see if you can improve over the baseline; (4) produce the model/algorithm and deploy it; (5) track key metrics in production to see whether the hypothesis holds.
  28. Here’s a blueprint for the different layers you’ll need. We’ll step through each area next.
  29. Okay, let’s start with the front-end (aka online). I won’t cover much here, except to point out that you’ll need an extremely good data pipeline. You’ll spend 90% of your time building this. It often needs to be built by an engineering team in collaboration with your researchers.
  30. There are many different types of data you’ll want to capture, including what your algorithms are doing, since you’ll need to correct for presentation bias, and the context and behavior in which users interact with you.
  31. You need a backend service that can accept and aggregate all these disparate data sources. Look at technologies like Suro, Kafka, etc. Stream to longer-term (cheap) storage (S3, HDFS).
  32. You need a common framework that makes it easier to instrument your code for events. Adopt it early and get it into every app as “standard”.
  33. Okay, let’s talk about where you (typically) define and train your models. Most of your models will be produced offline and embedded in production. You’ll need a platform that makes it easy to work across diverse tools (R, iPython, in-house), and a common format (which can be code) that lets you embed models once they are learned.
  34. Common confusion: models change less often than you think. The values you plug into them can still be real-time. http://mathurl.com/kuxa5hw
  35. Let’s walk through an example of a model we train: neural networks.
  36. These days we use GPUs (CUDA) to train the network: thousands of cores, massively parallel. Computing power is what’s changed; ANNs are really an old idea.
  37. But you still need to explore the hyper-parameter space. Parameters: learning rate theta …
  38. Parameters: how many layers, and how deep.
  39. AWS offers GPU compute instances. The approach is to search over many different architectures and parameters: distribute a different architecture to each instance, train the model, then evaluate. You can get smarter about how you explore this space: rather than doing a grid search, you search in the areas most likely to yield improvement. It costs 60 cents an hour. That’s a comparative fortune next to other instances, but it only takes a few hours to train a model that is used in production for weeks (or months). Perfect for experimental work.
  40. Your offline models won’t reflect sudden changes in behavior that they haven’t seen before. Here are OITNB and House of Cards (as searched for on Google). These can represent massive shifts in global user behavior, which can throw the model off. Also, some models degrade faster than others. You see this especially with tree models.
  41. Another problem: The models themselves still run in production (even though they’re trained offline). This limits how sophisticated you can make your models. They still need to return results within your SLAs.
  42. One solution: near-line computing. Re-train models based on events from the system. Pre-compute results where you can.
  43. You don’t always have to pre-compute the final results. The beauty of the near-line approach is that it lets you half-bake the model: the parts that are more static are pre-generated, and the parts that are more sensitive to changes get worked out on the fly. Remember our SVD model: U is users, M is movies, and R is ratings. It turns out that solving for U, if you know M and R, is a simple least squares problem. With modern linear algebra libraries we can compute that in milliseconds.
  44. Recomputes are event driven. There’s no need to re-compute if nothing has changed. So in this example, we re-compute the latent vector representing my tastes whenever there’s more information available about me to re-train that vector with.
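
As a rough illustration of notes 43 and 44, here is a minimal NumPy sketch of re-solving one user’s latent vector while the movie vectors stay fixed. It is a sketch under stated assumptions, not Netflix’s implementation: the function name `solve_user_vector`, the ridge penalty `reg`, and every number in the toy data are invented for illustration, and the mean and bias terms from the earlier slides are assumed to be known already.

```python
import numpy as np


def solve_user_vector(item_vectors, residuals, reg=0.1):
    """Least squares solve for one user's latent vector, with M held fixed.

    item_vectors: (n_rated, k) rows of M for the titles this user rated.
    residuals:    (n_rated,) observed ratings minus mean and bias terms.
    reg:          L2 (ridge) penalty; an assumption, not from the talk.
    """
    M = np.asarray(item_vectors, dtype=float)
    r = np.asarray(residuals, dtype=float)
    k = M.shape[1]
    # Normal equations: (M^T M + reg * I) u = M^T r
    A = M.T @ M + reg * np.eye(k)
    b = M.T @ r
    return np.linalg.solve(A, b)


# Toy data: 4 rated titles, 2 latent factors (all values made up).
M_rated = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [1.0, 1.0],
                    [2.0, 0.5]])
true_u = np.array([0.8, -0.3])
residuals = M_rated @ true_u  # noiseless, so the solve can recover true_u

# Event-driven recompute: call this only when new ratings arrive.
u = solve_user_vector(M_rated, residuals, reg=0.0)

# Prediction = mean rating + user bias + movie bias + interaction term.
# The three baseline values below are hypothetical.
mean_rating, user_bias, item_bias = 3.6, -0.2, 0.4
new_item = np.array([0.5, 1.5])
predicted = mean_rating + user_bias + item_bias + new_item @ u
```

Because the solve only involves a k-by-k system for a single user, it is cheap enough to run whenever a new event for that user arrives, which is the property the near-line approach relies on. With real, noisy ratings a non-zero `reg` would normally be used to keep the solution stable.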