This document discusses challenges and opportunities in applying machine learning. It argues that machine learning projects require data analysts to define business metrics, ensure data quality, and help machine learning models target the right problems. While machine learning skills are still specialized, tools are getting easier to use over time through approaches like transfer learning, AutoML, and BigQuery ML. Domain expertise remains crucial to ensure models optimize for real business impact rather than just accuracy. Overall, the document advocates a combined focus on both data analysis and machine learning concepts to successfully apply machine learning.
19. What does Machine Learning need to deliver business impact?
1. Data Governance
2. KPI definition
3. Agile delivery
4. DevOps support of infrastructure
5. Model accuracy
20. Time spent by a data science team
The same proportion should apply to people’s skills
https://medium.com/thelaunchpad/the-ml-surprise-f54706361a6c
23. How can data analysts help machine learning projects?
24. • Fast and shallow analysis [1]
• Gatekeeper for which problems should get a deep dive by machine learning
• Definition of which metrics to optimise
• Knowledge of data sources and what is missing
[1] https://hbr.org/2018/12/what-great-data-analysts-do-and-why-every-organization-needs-them
plus machine learning concepts
25. Start with business case
Don’t spend 6 months building infrastructure
Start here
26. What are good ML KPIs?
Bounce Rate
Engagement
Search Query
Conversions
Customer segments
Trend predictions
27. Know your error metric
https://dzone.com/articles/understanding-the-confusion-matrix
29. Predicting 100 conversions
3% conversion rate
Most “accurate” model predicts no conversions
                      Predicted Conversion   Predicted Non Conversion
Real Conversion                0                        3
Real Non Conversion            0                       97
Precision = 0 (strictly undefined, since no conversions are predicted; taken as 0)
Recall = 0
Accuracy = 0.97
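The figures on this slide can be reproduced with a few lines of Python. This is a minimal sketch, not taken from the original deck: the function name `classification_metrics` is hypothetical, and treating precision as 0 when nothing is predicted positive is a convention assumed here (strictly it is undefined). The second model is invented purely for comparison and is not from the slides.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute accuracy, precision and recall from confusion-matrix counts.

    tp: real conversions predicted as conversions
    fn: real conversions predicted as non-conversions
    fp: real non-conversions predicted as conversions
    tn: real non-conversions predicted as non-conversions
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    # Assumed convention: report 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# The slide's model: always predicts "no conversion" at a 3% conversion rate.
always_no = classification_metrics(tp=0, fn=3, fp=0, tn=97)
print(always_no)

# Hypothetical comparison model (numbers invented for illustration):
# it finds 2 of the 3 conversions at the cost of 5 false alarms.
finds_some = classification_metrics(tp=2, fn=1, fp=5, tn=92)
print(finds_some)
```

The always-no model scores higher on accuracy (0.97 against 0.94) yet finds no conversions at all, which is why the error metric has to be chosen with the business goal in mind.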
30. We are a long way from machine learning knowing how to pick the right KPI and data
31. We are much closer to machine learning knowing how to create machine learning models…
32. Custom code > TensorFlow > Keras > AutoML > ML APIs > BigQuery ML > ?
Specialist skills > Analyst skills
General applications > Use case driven
Custom models > Off the shelf
2010+ > 2019
33. Transfer Learning
Take advantage of pre-trained models, add custom final
layer
https://medium.com/tensorflow/a-look-at-how-we-built-the-emoji-scavenger-hunt-using-tensorflow-js-3d760a7ebfe6?linkId=58001921
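The idea of "pre-trained model plus custom final layer" can be sketched in plain Python. Everything here is a toy stand-in and an assumption, not the method used in the linked article: `pretrained_features` is a hypothetical frozen feature extractor playing the role of a real pre-trained network, and only the small final layer is trained, with logistic regression on four made-up examples.

```python
import math


def pretrained_features(x):
    """Hypothetical stand-in for a frozen pre-trained model.

    Maps a raw input (a, b) to a feature vector. Nothing in here is
    updated during training; the last element acts as a bias feature.
    """
    a, b = x
    return [a + b, a - b, 1.0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_final_layer(inputs, labels, epochs=500, lr=0.5):
    """Train only the custom final layer (one logistic unit) with SGD."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, y in zip(inputs, labels):
            feats = pretrained_features(x)  # frozen part, reused as-is
            pred = sigmoid(sum(w * f for w, f in zip(weights, feats)))
            error = pred - y  # gradient of log loss w.r.t. the logit
            weights = [w - lr * error * f for w, f in zip(weights, feats)]
    return weights


def predict(weights, x):
    feats = pretrained_features(x)
    return sigmoid(sum(w * f for w, f in zip(weights, feats))) >= 0.5


# Toy task (made-up data): label is 1 when the first value exceeds the second.
inputs = [(1.0, 0.0), (0.8, 0.2), (0.0, 1.0), (0.2, 0.8)]
labels = [1, 1, 0, 0]
weights = train_final_layer(inputs, labels)

print(predict(weights, (0.9, 0.1)))
print(predict(weights, (0.1, 0.9)))
```

In a real project the frozen part would be a published pre-trained network and the final layer would be trained on your own labelled data. The division of labour is the same: reuse the expensive part and fit only the small part.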
34. AutoML
Machine Learning on Machine Learning
https://www.forbes.com/sites/janakirammsv/2018/04/15/why-automl-is-set-to-become-the-future-of-artificial-intelligence/
37. AutoML performance can be better than custom code
Lak goes through applying different models to text classification:
• Keras - 80% accurate
• BigQuery ML - 78% accurate
• Google AutoML - 86% accurate
https://towardsdatascience.com/choosing-between-tensorflow-keras-bigquery-ml-and-automl-natural-language-for-text-classification-6b1c9fc21013
40. Summary
Machine learning projects that deliver need data analysts
Barrier to using machine learning is getting lower every day
Learning machine learning concepts is valuable even if not coding