SlideShare una empresa de Scribd logo
1 de 33
Recruiting Solutions 
1 
My Three Ex’s: 
A Data Science Approach for 
Applied Machine Learning
Dedicated to 3 of my favorite ex co-workers.
First, a disclosure. 
This isn’t a talk about machine learning. 
It’s a talk about applying machine learning. 
What’s the difference? 
3
Let’s talk about something else for a moment. 
Hash Tables 
4
What you (need to) know about hash tables. 
Theory Application 
5 
Class HashMap<K,V> 
java.lang.Object 
java.util.AbstractMap<K,V> 
java.util.HashMap<K,V> 
Type Parameters: 
K - the type of keys maintained by this map 
V - the type of mapped values 
All Implemented Interfaces: 
Serializable, Cloneable, Map<K,V>
Now let’s get back to machine learning! 
6
Please allow me to introduce my three ex’s. 
Express. 
Explain. 
Experiment. 
7
Embrace the data science mindset. 
Express 
Understand your utility and inputs. 
Explain 
Understand your models and metrics. 
Experiment 
Optimize for the speed of learning. 
8
Express. 
9
How to train your machine learning model. 
1. Define your objective function. 
2. Collect training data. 
3. Build models. 
4. Profit! 
10
You can only improve what you measure. 
11 
Clicks? 
Actions? 
Outcomes?
Be careful how you define precision… 
12
Account for non-uniform inputs and costs. 
13
Stratified sampling is your friend. 
14
An example of segmenting models. 
15 
Searcher: Recruiter 
Query: Person Name 
Searcher: Job Seeker 
Query: Person Name 
Searcher: Recruiter 
Query: Job Title 
Searcher: Job Seeker 
Query: Job Title
Express yourself in your feature vectors. 
16
Express: Summary. 
 Choose an objective function that models utility. 
 Be careful how you define precision. 
 Account for non-uniform inputs and costs. 
 Stratified sampling is your friend. 
 Express yourself in your feature vectors. 
17
Explain. 
18
With apologies to the little prince. 
19
Everyone is talking about Deep Learning. 
20
But accuracy isn’t everything. 
21
Explainable models, explainable features. 
 Less is more when it comes to explainability. 
 Algorithms can protect you from overfitting, but they can’t 
protect you from the biases you introduce. 
 Introspection into your models and features makes it 
easier for you and others to debug them. 
 Especially if you don’t completely trust your objective 
function or the representativeness of your training data. 
22
Linear regression? Decision trees? 
 Linear regression and decision trees favor explainability 
over accuracy, compared to more sophisticated models. 
 But size matters. If you have too many features or too 
deep a decision tree, you lose explainability. 
 You can always upgrade to a more sophisticated model 
when you trust your objective function and training data. 
 Build a machine learning model is an iterative process. 
Optimize for the speed of your own learning. 
23
Explain: Summary. 
 Accuracy isn’t everything. 
 Less is more when it comes to explainability. 
 Don’t knock linear models and decision trees! 
 Start with simple models, then upgrade. 
24
Experiment. 
25
Why experiments matter. 
“You have to kiss a lot of frogs to find one prince. 
So how can you find your prince faster? 
By finding more frogs and kissing them faster and 
faster.” 
-- Mike Moran 
26
Life in the age of big data. 
Yesterday Today 
27 
Experiments are expensive, 
choose hypotheses wisely. 
Experiments are cheap, 
do as many as you can!
So should we just test everything? 
28
Optimize for the speed of learning. 
29 
vs
Be disciplined: test one variable at a time. 
• Autocomplete 
• Entity Tagging 
• Vertical Intent 
• # of Suggestions 
• Suggestion Order 
• Language 
• Query Construction 
• Ranking Model 
30
Experiment: Summary. 
 Kiss lots of frogs: experiments are cheap. 
 But test in good faith – don’t just flip coins. 
 Optimize for the speed of learning. 
 Be disciplined: test one variable at a time. 
31
Bringing it all together. 
Express 
Understand your utility and inputs. 
Explain 
Understand your models and metrics. 
Experiment 
Optimize for the speed of learning. 
32
33 
Daniel Tunkelang 
dtunkelang@linkedin.com 
https://linkedin.com/in/dtunkelang

Más contenido relacionado

La actualidad más candente

The path to be a data scientist
The path to be a data scientistThe path to be a data scientist
The path to be a data scientistPoo Kuan Hoong
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learningKoundinya Desiraju
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learningTamir Taha
 
Machine Learning Landscape
Machine Learning LandscapeMachine Learning Landscape
Machine Learning LandscapeEng Teong Cheah
 
Probabilistic programming
Probabilistic programmingProbabilistic programming
Probabilistic programmingEli Gottlieb
 
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019Dhiana Deva
 
Claudia Gold: Learning Data Science Online
Claudia Gold: Learning Data Science OnlineClaudia Gold: Learning Data Science Online
Claudia Gold: Learning Data Science Onlinesfdatascience
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceNiko Vuokko
 
Probabilistic Programming in Python
Probabilistic Programming in PythonProbabilistic Programming in Python
Probabilistic Programming in PythonPeadar Coyle
 
Qualtrics and MaxDiff Analysis: Understanding True Customer Preference Rankings
Qualtrics and MaxDiff Analysis: Understanding True Customer Preference RankingsQualtrics and MaxDiff Analysis: Understanding True Customer Preference Rankings
Qualtrics and MaxDiff Analysis: Understanding True Customer Preference RankingsQualtrics
 
How to Become a Data Scientist
How to Become a Data ScientistHow to Become a Data Scientist
How to Become a Data Scientistryanorban
 
Data Science Methodology for Analytics and Solution Implementation
Data Science Methodology for Analytics and Solution ImplementationData Science Methodology for Analytics and Solution Implementation
Data Science Methodology for Analytics and Solution ImplementationRupak Roy
 
Writing Smarter Applications with Machine Learning
Writing Smarter Applications with Machine LearningWriting Smarter Applications with Machine Learning
Writing Smarter Applications with Machine LearningAnoop Thomas Mathew
 
MLSEV Virtual. Evaluations
MLSEV Virtual. EvaluationsMLSEV Virtual. Evaluations
MLSEV Virtual. EvaluationsBigML, Inc
 
How to Interview a Data Scientist
How to Interview a Data ScientistHow to Interview a Data Scientist
How to Interview a Data ScientistDaniel Tunkelang
 
MLSEV Virtual. Supervised vs Unsupervised
MLSEV Virtual. Supervised vs UnsupervisedMLSEV Virtual. Supervised vs Unsupervised
MLSEV Virtual. Supervised vs UnsupervisedBigML, Inc
 
Clare Corthell: Learning Data Science Online
Clare Corthell: Learning Data Science OnlineClare Corthell: Learning Data Science Online
Clare Corthell: Learning Data Science Onlinesfdatascience
 
How to become a data scientist
How to become a data scientistHow to become a data scientist
How to become a data scientistDeZyre
 

La actualidad más candente (20)

The path to be a data scientist
The path to be a data scientistThe path to be a data scientist
The path to be a data scientist
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learning
 
Jsm big-data
Jsm big-dataJsm big-data
Jsm big-data
 
Machine Learning Landscape
Machine Learning LandscapeMachine Learning Landscape
Machine Learning Landscape
 
Probabilistic programming
Probabilistic programmingProbabilistic programming
Probabilistic programming
 
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019
 
Claudia Gold: Learning Data Science Online
Claudia Gold: Learning Data Science OnlineClaudia Gold: Learning Data Science Online
Claudia Gold: Learning Data Science Online
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Probabilistic Programming in Python
Probabilistic Programming in PythonProbabilistic Programming in Python
Probabilistic Programming in Python
 
Qualtrics and MaxDiff Analysis: Understanding True Customer Preference Rankings
Qualtrics and MaxDiff Analysis: Understanding True Customer Preference RankingsQualtrics and MaxDiff Analysis: Understanding True Customer Preference Rankings
Qualtrics and MaxDiff Analysis: Understanding True Customer Preference Rankings
 
How to Become a Data Scientist
How to Become a Data ScientistHow to Become a Data Scientist
How to Become a Data Scientist
 
Guide: MaxDiff
Guide: MaxDiffGuide: MaxDiff
Guide: MaxDiff
 
Data Science Methodology for Analytics and Solution Implementation
Data Science Methodology for Analytics and Solution ImplementationData Science Methodology for Analytics and Solution Implementation
Data Science Methodology for Analytics and Solution Implementation
 
Writing Smarter Applications with Machine Learning
Writing Smarter Applications with Machine LearningWriting Smarter Applications with Machine Learning
Writing Smarter Applications with Machine Learning
 
MLSEV Virtual. Evaluations
MLSEV Virtual. EvaluationsMLSEV Virtual. Evaluations
MLSEV Virtual. Evaluations
 
How to Interview a Data Scientist
How to Interview a Data ScientistHow to Interview a Data Scientist
How to Interview a Data Scientist
 
MLSEV Virtual. Supervised vs Unsupervised
MLSEV Virtual. Supervised vs UnsupervisedMLSEV Virtual. Supervised vs Unsupervised
MLSEV Virtual. Supervised vs Unsupervised
 
Clare Corthell: Learning Data Science Online
Clare Corthell: Learning Data Science OnlineClare Corthell: Learning Data Science Online
Clare Corthell: Learning Data Science Online
 
How to become a data scientist
How to become a data scientistHow to become a data scientist
How to become a data scientist
 

Similar a My Three Ex’s: A Data Science Approach for Applied Machine Learning

ML crash course
ML crash courseML crash course
ML crash coursemikaelhuss
 
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...Alok Singh
 
Machine Learning basics
Machine Learning basicsMachine Learning basics
Machine Learning basicsNeeleEilers
 
Lessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsLessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsXavier Amatriain
 
Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018HJ van Veen
 
National STEM League - Student Goals and Academic Glue
National STEM League - Student Goals and Academic GlueNational STEM League - Student Goals and Academic Glue
National STEM League - Student Goals and Academic GlueNAFCareerAcads
 
Barga Data Science lecture 9
Barga Data Science lecture 9Barga Data Science lecture 9
Barga Data Science lecture 9Roger Barga
 
Human in the loop: Bayesian Rules Enabling Explainable AI
Human in the loop: Bayesian Rules Enabling Explainable AIHuman in the loop: Bayesian Rules Enabling Explainable AI
Human in the loop: Bayesian Rules Enabling Explainable AIPramit Choudhary
 
Machine learning para tertulianos, by javier ramirez at teowaki
Machine learning para tertulianos, by javier ramirez at teowakiMachine learning para tertulianos, by javier ramirez at teowaki
Machine learning para tertulianos, by javier ramirez at teowakijavier ramirez
 
Andrew NG machine learning
Andrew NG machine learningAndrew NG machine learning
Andrew NG machine learningShareDocView.com
 
The importance of model fairness and interpretability in AI systems
The importance of model fairness and interpretability in AI systemsThe importance of model fairness and interpretability in AI systems
The importance of model fairness and interpretability in AI systemsFrancesca Lazzeri, PhD
 
Data Science Accelerator Program
Data Science Accelerator ProgramData Science Accelerator Program
Data Science Accelerator ProgramGoDataDriven
 
Learning to Learn Model Behavior ( Capital One: data intelligence conference )
Learning to Learn Model Behavior ( Capital One: data intelligence conference )Learning to Learn Model Behavior ( Capital One: data intelligence conference )
Learning to Learn Model Behavior ( Capital One: data intelligence conference )Pramit Choudhary
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learningAkshay Kanchan
 
An introduction to machine learning and statistics
An introduction to machine learning and statisticsAn introduction to machine learning and statistics
An introduction to machine learning and statisticsSpotle.ai
 
Machine Learning and Deep Learning from Foundations to Applications Excel, R,...
Machine Learning and Deep Learning from Foundations to Applications Excel, R,...Machine Learning and Deep Learning from Foundations to Applications Excel, R,...
Machine Learning and Deep Learning from Foundations to Applications Excel, R,...Narendra Ashar
 
Keepler Data Tech | Entendiendo tus propios modelos predictivos
Keepler Data Tech | Entendiendo tus propios modelos predictivosKeepler Data Tech | Entendiendo tus propios modelos predictivos
Keepler Data Tech | Entendiendo tus propios modelos predictivosKeepler Data Tech
 
9.b-CMPS 403-F20-Session 9-Intro to ML II.pdf
9.b-CMPS 403-F20-Session 9-Intro to ML II.pdf9.b-CMPS 403-F20-Session 9-Intro to ML II.pdf
9.b-CMPS 403-F20-Session 9-Intro to ML II.pdfAmirMohamedNabilSale
 

Similar a My Three Ex’s: A Data Science Approach for Applied Machine Learning (20)

ML crash course
ML crash courseML crash course
ML crash course
 
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
 
Machine Learning basics
Machine Learning basicsMachine Learning basics
Machine Learning basics
 
Lessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsLessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systems
 
Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018
 
Lecture 1
Lecture 1Lecture 1
Lecture 1
 
lec1.ppt
lec1.pptlec1.ppt
lec1.ppt
 
National STEM League - Student Goals and Academic Glue
National STEM League - Student Goals and Academic GlueNational STEM League - Student Goals and Academic Glue
National STEM League - Student Goals and Academic Glue
 
Barga Data Science lecture 9
Barga Data Science lecture 9Barga Data Science lecture 9
Barga Data Science lecture 9
 
Human in the loop: Bayesian Rules Enabling Explainable AI
Human in the loop: Bayesian Rules Enabling Explainable AIHuman in the loop: Bayesian Rules Enabling Explainable AI
Human in the loop: Bayesian Rules Enabling Explainable AI
 
Machine learning para tertulianos, by javier ramirez at teowaki
Machine learning para tertulianos, by javier ramirez at teowakiMachine learning para tertulianos, by javier ramirez at teowaki
Machine learning para tertulianos, by javier ramirez at teowaki
 
Andrew NG machine learning
Andrew NG machine learningAndrew NG machine learning
Andrew NG machine learning
 
The importance of model fairness and interpretability in AI systems
The importance of model fairness and interpretability in AI systemsThe importance of model fairness and interpretability in AI systems
The importance of model fairness and interpretability in AI systems
 
Data Science Accelerator Program
Data Science Accelerator ProgramData Science Accelerator Program
Data Science Accelerator Program
 
Learning to Learn Model Behavior ( Capital One: data intelligence conference )
Learning to Learn Model Behavior ( Capital One: data intelligence conference )Learning to Learn Model Behavior ( Capital One: data intelligence conference )
Learning to Learn Model Behavior ( Capital One: data intelligence conference )
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learning
 
An introduction to machine learning and statistics
An introduction to machine learning and statisticsAn introduction to machine learning and statistics
An introduction to machine learning and statistics
 
Machine Learning and Deep Learning from Foundations to Applications Excel, R,...
Machine Learning and Deep Learning from Foundations to Applications Excel, R,...Machine Learning and Deep Learning from Foundations to Applications Excel, R,...
Machine Learning and Deep Learning from Foundations to Applications Excel, R,...
 
Keepler Data Tech | Entendiendo tus propios modelos predictivos
Keepler Data Tech | Entendiendo tus propios modelos predictivosKeepler Data Tech | Entendiendo tus propios modelos predictivos
Keepler Data Tech | Entendiendo tus propios modelos predictivos
 
9.b-CMPS 403-F20-Session 9-Intro to ML II.pdf
9.b-CMPS 403-F20-Session 9-Intro to ML II.pdf9.b-CMPS 403-F20-Session 9-Intro to ML II.pdf
9.b-CMPS 403-F20-Session 9-Intro to ML II.pdf
 

Más de Daniel Tunkelang

Query Understanding and Ecommerce
Query Understanding and EcommerceQuery Understanding and Ecommerce
Query Understanding and EcommerceDaniel Tunkelang
 
Semantic Equivalence of e-Commerce Queries
Semantic Equivalence of e-Commerce QueriesSemantic Equivalence of e-Commerce Queries
Semantic Equivalence of e-Commerce QueriesDaniel Tunkelang
 
Helping Searchers Satisfice through Query Understanding
Helping Searchers Satisfice through Query UnderstandingHelping Searchers Satisfice through Query Understanding
Helping Searchers Satisfice through Query UnderstandingDaniel Tunkelang
 
Query Understanding: A Manifesto
Query Understanding: A ManifestoQuery Understanding: A Manifesto
Query Understanding: A ManifestoDaniel Tunkelang
 
Where should you put your data scientists?
Where should you put your data scientists?Where should you put your data scientists?
Where should you put your data scientists?Daniel Tunkelang
 
Better Search Through Query Understanding
Better Search Through Query UnderstandingBetter Search Through Query Understanding
Better Search Through Query UnderstandingDaniel Tunkelang
 
Find and be Found: Information Retrieval at LinkedIn
Find and be Found: Information Retrieval at LinkedInFind and be Found: Information Retrieval at LinkedIn
Find and be Found: Information Retrieval at LinkedInDaniel Tunkelang
 
Enterprise Search: How do we get there from here?
Enterprise Search: How do we get there from here?Enterprise Search: How do we get there from here?
Enterprise Search: How do we get there from here?Daniel Tunkelang
 
Big Data, We Have a Communication Problem
Big Data, We Have a Communication Problem Big Data, We Have a Communication Problem
Big Data, We Have a Communication Problem Daniel Tunkelang
 
Information, Attention, and Trust: A Hierarchy of Needs
Information, Attention, and Trust: A Hierarchy of NeedsInformation, Attention, and Trust: A Hierarchy of Needs
Information, Attention, and Trust: A Hierarchy of NeedsDaniel Tunkelang
 
Data By The People, For The People
Data By The People, For The PeopleData By The People, For The People
Data By The People, For The PeopleDaniel Tunkelang
 
Content, Connections, and Context
Content, Connections, and ContextContent, Connections, and Context
Content, Connections, and ContextDaniel Tunkelang
 
Scale, Structure, and Semantics
Scale, Structure, and SemanticsScale, Structure, and Semantics
Scale, Structure, and SemanticsDaniel Tunkelang
 
Strata 2012: Humans, Machines, and the Dimensions of Microwork
Strata 2012: Humans, Machines, and the Dimensions of MicroworkStrata 2012: Humans, Machines, and the Dimensions of Microwork
Strata 2012: Humans, Machines, and the Dimensions of MicroworkDaniel Tunkelang
 
Recommendations as a Conversation with the User
Recommendations as a Conversation with the UserRecommendations as a Conversation with the User
Recommendations as a Conversation with the UserDaniel Tunkelang
 
Keeping It Professional: Relevance, Recommendations, and Reputation at LinkedIn
Keeping It Professional: Relevance, Recommendations, and Reputation at LinkedInKeeping It Professional: Relevance, Recommendations, and Reputation at LinkedIn
Keeping It Professional: Relevance, Recommendations, and Reputation at LinkedInDaniel Tunkelang
 
The War on Attention Poverty: Measuring Twitter Authority
The War on Attention Poverty: Measuring Twitter AuthorityThe War on Attention Poverty: Measuring Twitter Authority
The War on Attention Poverty: Measuring Twitter AuthorityDaniel Tunkelang
 

Más de Daniel Tunkelang (20)

Query Understanding and Ecommerce
Query Understanding and EcommerceQuery Understanding and Ecommerce
Query Understanding and Ecommerce
 
Semantic Equivalence of e-Commerce Queries
Semantic Equivalence of e-Commerce QueriesSemantic Equivalence of e-Commerce Queries
Semantic Equivalence of e-Commerce Queries
 
Helping Searchers Satisfice through Query Understanding
Helping Searchers Satisfice through Query UnderstandingHelping Searchers Satisfice through Query Understanding
Helping Searchers Satisfice through Query Understanding
 
MMM, Search!
MMM, Search!MMM, Search!
MMM, Search!
 
Enterprise Intelligence
Enterprise IntelligenceEnterprise Intelligence
Enterprise Intelligence
 
Query Understanding: A Manifesto
Query Understanding: A ManifestoQuery Understanding: A Manifesto
Query Understanding: A Manifesto
 
Where should you put your data scientists?
Where should you put your data scientists?Where should you put your data scientists?
Where should you put your data scientists?
 
Better Search Through Query Understanding
Better Search Through Query UnderstandingBetter Search Through Query Understanding
Better Search Through Query Understanding
 
Find and be Found: Information Retrieval at LinkedIn
Find and be Found: Information Retrieval at LinkedInFind and be Found: Information Retrieval at LinkedIn
Find and be Found: Information Retrieval at LinkedIn
 
Enterprise Search: How do we get there from here?
Enterprise Search: How do we get there from here?Enterprise Search: How do we get there from here?
Enterprise Search: How do we get there from here?
 
Big Data, We Have a Communication Problem
Big Data, We Have a Communication Problem Big Data, We Have a Communication Problem
Big Data, We Have a Communication Problem
 
Information, Attention, and Trust: A Hierarchy of Needs
Information, Attention, and Trust: A Hierarchy of NeedsInformation, Attention, and Trust: A Hierarchy of Needs
Information, Attention, and Trust: A Hierarchy of Needs
 
Data By The People, For The People
Data By The People, For The PeopleData By The People, For The People
Data By The People, For The People
 
Content, Connections, and Context
Content, Connections, and ContextContent, Connections, and Context
Content, Connections, and Context
 
Scale, Structure, and Semantics
Scale, Structure, and SemanticsScale, Structure, and Semantics
Scale, Structure, and Semantics
 
Strata 2012: Humans, Machines, and the Dimensions of Microwork
Strata 2012: Humans, Machines, and the Dimensions of MicroworkStrata 2012: Humans, Machines, and the Dimensions of Microwork
Strata 2012: Humans, Machines, and the Dimensions of Microwork
 
Recommendations as a Conversation with the User
Recommendations as a Conversation with the UserRecommendations as a Conversation with the User
Recommendations as a Conversation with the User
 
Keeping It Professional: Relevance, Recommendations, and Reputation at LinkedIn
Keeping It Professional: Relevance, Recommendations, and Reputation at LinkedInKeeping It Professional: Relevance, Recommendations, and Reputation at LinkedIn
Keeping It Professional: Relevance, Recommendations, and Reputation at LinkedIn
 
The War on Attention Poverty: Measuring Twitter Authority
The War on Attention Poverty: Measuring Twitter AuthorityThe War on Attention Poverty: Measuring Twitter Authority
The War on Attention Poverty: Measuring Twitter Authority
 
Design for Interaction
Design for InteractionDesign for Interaction
Design for Interaction
 

Último

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 

Último (20)

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 

My Three Ex’s: A Data Science Approach for Applied Machine Learning

  • 1. Recruiting Solutions 1 My Three Ex’s: A Data Science Approach for Applied Machine Learning
  • 2. Dedicated to 3 of my favorite ex co-workers.
  • 3. First, a disclosure. This isn’t a talk about machine learning. It’s a talk about applying machine learning. What’s the difference? 3
  • 4. Let’s talk about something else for a moment. Hash Tables 4
  • 5. What you (need to) know about hash tables. Theory Application 5 Class HashMap<K,V> java.lang.Object java.util.AbstractMap<K,V> java.util.HashMap<K,V> Type Parameters: K - the type of keys maintained by this map V - the type of mapped values All Implemented Interfaces: Serializable, Cloneable, Map<K,V>
  • 6. Now let’s get back to machine learning! 6
  • 7. Please allow me to introduce my three ex’s. Express. Explain. Experiment. 7
  • 8. Embrace the data science mindset. Express Understand your utility and inputs. Explain Understand your models and metrics. Experiment Optimize for the speed of learning. 8
  • 10. How to train your machine learning model. 1. Define your objective function. 2. Collect training data. 3. Build models. 4. Profit! 10
  • 11. You can only improve what you measure. 11 Clicks? Actions? Outcomes?
  • 12. Be careful how you define precision… 12
  • 13. Account for non-uniform inputs and costs. 13
  • 14. Stratified sampling is your friend. 14
  • 15. An example of segmenting models. 15 Searcher: Recruiter Query: Person Name Searcher: Job Seeker Query: Person Name Searcher: Recruiter Query: Job Title Searcher: Job Seeker Query: Job Title
  • 16. Express yourself in your feature vectors. 16
  • 17. Express: Summary.  Choose an objective function that models utility.  Be careful how you define precision.  Account for non-uniform inputs and costs.  Stratified sampling is your friend.  Express yourself in your feature vectors. 17
  • 19. With apologies to the little prince. 19
  • 20. Everyone is talking about Deep Learning. 20
  • 21. But accuracy isn’t everything. 21
  • 22. Explainable models, explainable features.  Less is more when it comes to explainability.  Algorithms can protect you from overfitting, but they can’t protect you from the biases you introduce.  Introspection into your models and features makes it easier for you and others to debug them.  Especially if you don’t completely trust your objective function or the representativeness of your training data. 22
  • 23. Linear regression? Decision trees?  Linear regression and decision trees favor explainability over accuracy, compared to more sophisticated models.  But size matters. If you have too many features or too deep a decision tree, you lose explainability.  You can always upgrade to a more sophisticated model when you trust your objective function and training data.  Build a machine learning model is an iterative process. Optimize for the speed of your own learning. 23
  • 24. Explain: Summary.  Accuracy isn’t everything.  Less is more when it comes to explainability.  Don’t knock linear models and decision trees!  Start with simple models, then upgrade. 24
  • 26. Why experiments matter. “You have to kiss a lot of frogs to find one prince. So how can you find your prince faster? By finding more frogs and kissing them faster and faster.” -- Mike Moran 26
  • 27. Life in the age of big data. Yesterday Today 27 Experiments are expensive, choose hypotheses wisely. Experiments are cheap, do as many as you can!
  • 28. So should we just test everything? 28
  • 29. Optimize for the speed of learning. 29 vs
  • 30. Be disciplined: test one variable at a time. • Autocomplete • Entity Tagging • Vertical Intent • # of Suggestions • Suggestion Order • Language • Query Construction • Ranking Model 30
  • 31. Experiment: Summary.  Kiss lots of frogs: experiments are cheap.  But test in good faith – don’t just flip coins.  Optimize for the speed of learning.  Be disciplined: test one variable at a time. 31
  • 32. Bringing it all together. Express Understand your utility and inputs. Explain Understand your models and metrics. Experiment Optimize for the speed of learning. 32
  • 33. 33 Daniel Tunkelang dtunkelang@linkedin.com https://linkedin.com/in/dtunkelang