10 More Lessons
Learned from building real-life Machine Learning Systems
Xavier Amatriain (@xamat) 10/13/2015
Machine Learning
@Quora
Our Mission
“To share and grow the world’s knowledge”
● Millions of questions & answers
● Millions of users
● Thousands of...
Demand
What we care about
Quality
Relevance
Lots of data relations
ML Applications @ Quora
● Answer ranking
● Feed ranking
● Topic recommendations
● User recommendations
● Email digest
● As...
Models
● Logistic Regression
● Elastic Nets
● Gradient Boosted Decision Trees
● Random Forests
● (Deep) Neural Networks
● ...
10 More Lessons
Learned from implementing real-life ML systems
1. Implicit signals beat explicit ones (almost always)
Implicit vs. Explicit
● Many have acknowledged that implicit feedback is more useful
● Is implicit feedback really always
...
● Implicit data is (usually):
○ More dense, and available for all users
○ More representative of user behavior vs.
user ...
● However
○ It is not always the case that direct implicit feedback correlates well with long-term retention
○ E.g. clickb...
2. Your Model will learn what you teach it to learn
Training a model
● Model will learn according to:
○ Training data (e.g. implicit and explicit)
○ Target function (e.g. pro...
Example 2 - Quora’s feed
● Training data = implicit + explicit
● Target function: Value of showing a story to a
user ~ wei...
3. Supervised plus (not vs.) Unsupervised Learning
Supervised/Unsupervised Learning
● Unsupervised learning as dimensionality reduction
● Unsupervised learning as feature en...
Supervised/Unsupervised Learning
● One of the “tricks” in Deep Learning is how it
combines unsupervised/supervised learnin...
4. Everything is an ensemble
Ensembles
● Netflix Prize was won by an ensemble
○ Initially BellKor was using GBDTs
○ BigChaos introduced ANN-based ensem...
Ensembles & Feature Engineering
● Ensembles are the way to turn any model into a feature!
● E.g. Don’t know if the way to ...
The Master Algorithm?
It definitely is an ensemble!
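The idea that an ensemble turns any model into a feature can be sketched in a few lines. This is a minimal illustration only, not the method used at Quora or by the Netflix Prize teams: the two base scorers (`popularity_score`, `recency_score`), the toy data, and the hand-rolled logistic regression are all hypothetical. The outputs of the base models become the input features of a second-level model.

```python
import math

# Hypothetical base models: each maps a raw item to a single score.
def popularity_score(item):
    return item["upvotes"] / (item["upvotes"] + 10.0)

def recency_score(item):
    return 1.0 / (1.0 + item["age_days"])

def to_features(item):
    # The base-model outputs are the features of the second-level model.
    return [popularity_score(item), recency_score(item)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, lr=0.5, epochs=2000):
    # Plain batch gradient descent on log loss; weights[0] is the bias.
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
            err = sigmoid(z) - y
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

def predict(weights, item):
    x = to_features(item)
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))

# Toy, made-up training data: label 1 means the user engaged with the item.
items = [
    {"upvotes": 200, "age_days": 1},
    {"upvotes": 150, "age_days": 2},
    {"upvotes": 3, "age_days": 40},
    {"upvotes": 1, "age_days": 90},
]
labels = [1, 1, 0, 0]

stacked = train_logistic([to_features(i) for i in items], labels)
fresh_popular = predict(stacked, {"upvotes": 180, "age_days": 1})
old_unpopular = predict(stacked, {"upvotes": 2, "age_days": 60})
```

Swapping a base scorer for a better one changes only `to_features`; the second-level model is retrained on the new outputs, which is also why changes in one model's output can silently affect the models that consume it.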
5. The output of your model will be the input of another one (and other design problems)
Outputs will be inputs
● Ensembles turn any model into a feature
○ That’s great!
○ That can be a mess!
● Make sure the out...
ML vs Software
● Can you treat your ML infrastructure as you would
your software one?
○ Yes and No
● You should apply best...
6. The pains & gains of Feature Engineering
Feature Engineering
● Main properties of a well-behaved ML feature
○ Reusable
○ Transformable
○ Interpretable
○ Reliable
●...
Feature Engineering Example - Quora Answer Ranking
What is a good Quora answer?
• truthful
• reusable
• provides explanati...
Feature Engineering Example - Quora Answer Ranking
How are those dimensions translated
into features?
• Features that rela...
7. The two faces of your ML infrastructure
Machine Learning Infrastructure
● Whenever you develop any ML infrastructure, you need to
target two different modes:
○ Mo...
Machine Learning Infrastructure: Experimentation & Production
● Option 1:
○ Favor experimentation and only invest in produ...
● Good intermediate options:
○ Have ML “researchers” experiment on IPython Notebooks using
Python tools (scikit-learn, The...
8. Why you should care about answering questions (about your model)
Model debuggability
● Value of a model = value it brings to the product
● Product owners/stakeholders have expectations on...
Model debuggability
● E.g. Why am I seeing or not seeing this on my homepage feed?
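One way to answer a question like this is to have the model report why it produced a given score. The sketch below assumes a simple linear scorer, where each feature's contribution is its weight times its value. The feature names and weights are hypothetical and are not Quora's, and this decomposition does not carry over directly to non-linear models.

```python
# Hypothetical weights of a linear feed-ranking scorer.
WEIGHTS = {
    "follows_topic": 2.0,
    "author_followed": 1.5,
    "story_age_days": -0.3,
    "already_seen": -4.0,
}

def score(features):
    # Linear score: sum of weight * value over the given features.
    return sum(WEIGHTS[name] * value for name, value in features.items())

def explain(features):
    # Per-feature contributions, sorted by absolute impact on the score.
    contributions = {name: WEIGHTS[name] * value for name, value in features.items()}
    return sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)

story = {
    "follows_topic": 1,
    "author_followed": 0,
    "story_age_days": 2,
    "already_seen": 1,
}
total = score(story)
reasons = explain(story)
```

For this made-up story the first entry of `reasons` is `already_seen`, which is the kind of answer a product owner can act on when asking why a story was ranked low.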
9. You don’t need to distribute your ML algorithm
Distributing ML
● Most of what people do in practice can fit into a multi-core machine
○ Smart data sampling
○ Offline sc...
Distributing ML
● Example of optimizing computations to fit them into one machine
○ Spark implementation: 6 hours, 15 mach...
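Data sampling is one of the ways listed above to keep training on a single machine. The slides do not say which sampling scheme is meant, so the following is only one possible illustration: reservoir sampling, which draws a fixed-size uniform sample from a stream too large to hold in memory. The function name, the sample size, and the synthetic event stream are assumptions.

```python
import random

def reservoir_sample(stream, k, seed=0):
    # Keep a uniform random sample of k items from an iterable of unknown length.
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Item i replaces a kept item with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = item
    return sample

# Stand-in for a large log of training examples, generated lazily.
events = (("user%d" % (n % 1000), n) for n in range(100000))
training_subset = reservoir_sample(events, k=500)
```

The stream is consumed once and only `k` items are held at any time, so memory use does not grow with the size of the log.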
10. The untold story of Data Science vs. (not and) ML engineering
Data Scientists and ML Engineers
● We all know the definition of a Data Scientist
● Where do Data Scientists fit in an org...
The data-driven ML innovation funnel
Data Research
ML Exploration - Product Design
AB Testing
Data Scientists and ML Engineers
● Solution:
○ (1) Define different parts of the innovation funnel
■ Part 1. Data research...
Conclusions
● Make sure you teach your model what you want it to learn
● Ensembles and the combination of
supervised/unsupervised tech...
Machine Learning @Quora
Published in: Engineering, Technology