(Py)Testing the Limits of Machine Learning
Rebecca Bilbro ⩓ Daniel Sollis ⩓ Patrick Deziel
01. Introduction
Why test ML?
02. DIY Testing API
Building blocks of a good ML test suite
03. Non-Determinism
Keeping your head when the models act up
04. Experiment with Care
ML diagnostics for experimental robustness
05. Conclusion
Level up your ML game with these testing tips & tricks
Why test ML?
01
Do we
need to
test ML
code?
“Testing is for software,
not data science.”
“It’s a waste of time to
test experimental
research code.”
“We follow hypothesis-driven
development, not test-driven
development.”
Can we
test ML
code?
“Machine learning algorithms are non-deterministic,
so there’s no way to test them.”
“Our Jupyter notebooks
don’t support test runners.”
“Machine learning has too many
parameters to test them all.”
Bottom Line
If it’s going into a product,
it needs to be tested.
Building blocks
of a good ML
test suite
02
Estimators and Transformers
Inheriting from sklearn's estimator and transformer base classes
(BaseEstimator and TransformerMixin in current releases) lets you
override existing methods such as fit(), transform(), and predict().
This lets you generalize over the various models and
transformations in sklearn.
It also allows pipelines to be used consistently across
both preprocessing and modeling.
Transformer: fit(X, y), transform(X) → X′
Estimator: fit(X, y), predict(X) → ŷ
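As a minimal sketch of the transformer half of this idea, the class below inherits from sklearn's `TransformerMixin` and `BaseEstimator` and overrides `fit()` and `transform()`. The class name `MeanCenterer` and the toy data are assumptions made for illustration; they do not come from the slides.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class MeanCenterer(TransformerMixin, BaseEstimator):
    """Hypothetical transformer: subtracts the column means learned in fit()."""

    def fit(self, X, y=None):
        # Learn one mean per column and return self so calls can be chained.
        self.mean_ = np.asarray(X, dtype=float).mean(axis=0)
        return self

    def transform(self, X):
        # Apply the learned means to any new data with the same columns.
        return np.asarray(X, dtype=float) - self.mean_


# fit_transform() is inherited from TransformerMixin.
centered = MeanCenterer().fit_transform([[1.0, 10.0], [3.0, 30.0]])
```

Because the class follows the fit()/transform() contract, it can be dropped into a Pipeline like any built-in sklearn transformer, and it can be unit tested in isolation.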
Creating a Wrapper
Diagram (Inheriting & Overloading): ModelWrapper inherits from Estimator and Transformer and exposes fit(), transform(), and predict().
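One possible shape for such a wrapper is sketched below. The slides name `ModelWrapper` and its three methods but do not show the implementation, so the constructor argument `model` and the delegation logic are assumptions, not the authors' code.

```python
from sklearn.base import BaseEstimator
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler


class ModelWrapper(BaseEstimator):
    """Hypothetical wrapper giving one interface to any wrapped sklearn object."""

    def __init__(self, model):
        self.model = model

    def fit(self, X, y=None):
        # Delegate fitting, then return the wrapper so calls can be chained.
        self.model.fit(X, y)
        return self

    def transform(self, X):
        # Only meaningful when the wrapped object is a transformer.
        return self.model.transform(X)

    def predict(self, X):
        # Only meaningful when the wrapped object is an estimator.
        return self.model.predict(X)


X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

scaled = ModelWrapper(StandardScaler()).fit(X).transform(X)
labels = ModelWrapper(LogisticRegression()).fit(X, y).predict(X)
```

With a wrapper like this, tests can target a single interface instead of each underlying model class.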
Pipelines and FeatureUnions
The Pipeline and FeatureUnion classes in scikit-learn
let you organize preprocessing and modeling
so that you can iterate quickly through experiments.
A Pipeline chains steps sequentially, each step feeding the next,
while a FeatureUnion applies several transformers to the same
input in parallel and concatenates their outputs.
With a wrapper class in place, using these
classes becomes even easier.
Diagram: Data Loader → Transformer → Transformer → Estimator, driven by fit() and predict()
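from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# EssayExtractor is a custom transformer defined elsewhere (not part of sklearn).
pipeline = Pipeline([
    ('extract_essays', EssayExtractor()),
    ('counts', CountVectorizer()),
    ('tf_idf', TfidfTransformer()),
    ('classifier', MultinomialNB())
])
# The final step is a classifier, so call fit() rather than fit_transform().
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)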
Create a pipeline that
loads data from a file
on disk, extracts each
instance as an
individual essay, then
applies text feature
extraction before a
text classification
model.
Pipeline Example
Diagram: extract_essays → counts → tf_idf → classifier
http://zacstewart.com/2014/08/05/pipelines-of-featureunions-of-pipelines.html
http://zacstewart.com/2014/08/05/pipelines-of-featureunions-of-pipelines.html
feature_union
extract_essays
counts
tf_idf
classifier
document meta concepts
DictVectorizer DictVectorizer
Feature
Union
pipeline = Pipeline([
    ('extract_essays', EssayExtractor()),
    ('features', FeatureUnion([
        ('ngram_tf_idf', Pipeline([
            ('counts', CountVectorizer()),
            ('tf_idf', TfidfTransformer())
        ])),
        ('essay_length', LengthTransformer()),
        ('misspellings', MispellingCountTransformer())
    ])),
    ('classifier', MultinomialNB())
])
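The custom transformers on this slide are not defined in the deck, so the snippet cannot run as shown. The sketch below is a reduced, self-contained variant: it keeps the TF-IDF branch, substitutes a hypothetical `LengthTransformer` for the essay-length branch, and drops the extraction and misspelling steps. The toy essays and labels are made up for illustration.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import FeatureUnion, Pipeline


class LengthTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical stand-in: one feature per essay, its character count."""

    def fit(self, X, y=None):
        # Nothing to learn from the data.
        return self

    def transform(self, X):
        return np.array([[len(essay)] for essay in X])


pipeline = Pipeline([
    ("features", FeatureUnion([
        # Branch 1: bag-of-words counts reweighted by TF-IDF.
        ("ngram_tf_idf", Pipeline([
            ("counts", CountVectorizer()),
            ("tf_idf", TfidfTransformer()),
        ])),
        # Branch 2: a single length column, concatenated with branch 1.
        ("essay_length", LengthTransformer()),
    ])),
    ("classifier", MultinomialNB()),
])

essays = ["muffins are baked", "dogs bark loudly", "baked muffins", "loud dogs bark"]
labels = ["muffin", "dog", "muffin", "dog"]

pipeline.fit(essays, labels)
features = pipeline.named_steps["features"].transform(essays)
y_pred = pipeline.predict(essays)
```

The combined feature matrix has one column per vocabulary term plus one length column, which is the concatenation behavior that distinguishes a FeatureUnion from a sequential Pipeline.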
Coding Style and Enforcement
Part of keeping our standards high
is enforcing an agreed-upon coding
style and sticking to it.
We use pre-commit in addition to
Black to ensure that our repository
stays clean and consistent across
commits.
The Double-Edged Sword of Black
python -m black '.file.py'
CI/CD with Jenkins
Using Jenkins for build testing helps
keep the whole team on the same
page and enforces the team's
testing standards.
Automating builds in addition to
local testing helps ensure that
code works across different
environments and machines.
Push
Pre-Commit
Black
Jenkins
Build/Testing
CI/CD Flow
Dealing with
Non-Determinism
03
Testing an ML Pipeline
● How do we handle non-determinism in our pipeline?
● How do we test multiple parameters in our pipeline?
● How do we handle small variations in our pipeline?
Scikit-learn
Pipeline
https://www.freecodecamp.org/news/chihuahua-or-muffin-my-search-for-the-best-computer-vision-api-cbda4d6b425d/
Different Data, Different Results
Diagram: the same Scikit-learn pipeline, trained and tested on two different splits of the data (Train/Test and Test/Train), produces different Muffin/Dog results.
Different Executions, Different Results
Diagram: the same Scikit-learn pipeline, run twice on the same Train/Test split, produces different Muffin/Dog results.
Ensuring Reproducibility
● Fixing the random seed can ensure reproducibility across
executions of the same code.
● Scikit-learn's non-deterministic estimators and utilities generally
accept a random_state parameter, which allows the user to fix the
random seed.
class sklearn.neural_network.MLPClassifier(hidden_layer_sizes=(100,),
activation='relu', *, solver='adam', alpha=0.0001, batch_size='auto',
learning_rate='constant', learning_rate_init=0.001, power_t=0.5, max_iter=200,
shuffle=True, random_state=None, tol=0.0001, verbose=False, warm_start=False,
momentum=0.9, nesterovs_momentum=True, early_stopping=False,
validation_fraction=0.1, beta_1=0.9, beta_2=0.999, epsilon=1e-08,
n_iter_no_change=10, max_fun=15000)
https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html
Using random_state
● Our function will now produce the same results on
different executions if we pass it the same data.
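The code from this slide did not survive extraction. As a rough sketch of the idea, the hypothetical helper `train_and_predict` below passes one seed to both `train_test_split` and `MLPClassifier`; the synthetic dataset, layer size, and iteration count are assumptions chosen only to keep the example fast.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier


def train_and_predict(X, y, seed=None):
    """Hypothetical helper: split the data, fit an MLP, predict the test split."""
    # The same seed controls both the split and the weight initialization.
    X_train, X_test, y_train, _ = train_test_split(X, y, random_state=seed)
    model = MLPClassifier(hidden_layer_sizes=(10,), max_iter=300, random_state=seed)
    model.fit(X_train, y_train)
    return model.predict(X_test)


X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Two executions with the same data and the same seed.
first = train_and_predict(X, y, seed=42)
second = train_and_predict(X, y, seed=42)
```

With `seed=None` the two calls may disagree, which is the behavior the previous slides illustrate.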
(Py)Testing Our Function
● ML comes with an abundance of options.
● How do we test multiple parameters without
turning our test code into spaghetti?
Using pytest.mark.parametrize
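The slide's example was an image and is not in the text. The following sketch shows how the `pytest.mark.parametrize` decorator can cover several parameter combinations with one test function; the test name, the chosen parameter values, and the synthetic data are assumptions for illustration.

```python
import pytest
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier


# Stacked decorators run the test once per combination (3 x 2 = 6 cases).
@pytest.mark.parametrize("activation", ["relu", "tanh", "logistic"])
@pytest.mark.parametrize("hidden_layer_sizes", [(5,), (5, 5)])
def test_mlp_predicts_known_classes(activation, hidden_layer_sizes):
    X, y = make_classification(n_samples=100, n_features=6, random_state=0)
    model = MLPClassifier(
        hidden_layer_sizes=hidden_layer_sizes,
        activation=activation,
        max_iter=50,
        random_state=0,
    )
    model.fit(X, y)
    predictions = model.predict(X)

    # Structural checks that should hold for every parameter combination.
    assert predictions.shape == y.shape
    assert set(predictions) <= set(y)
```

Each combination is reported as a separate test case, so a failure points directly at the parameters that caused it.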
Dealing With Inevitable Variations
● With floating point arithmetic, things can get...strange.
● In order to correctly test ML, we need a better way to
compare floating point results.
● We need a method of handling results that are “close
enough”.
○ E.g., Training time
Using pytest.approx
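The slide's code is missing from the text, so here is a small sketch of `pytest.approx` in use. The helper `mean_accuracy` and the score values are hypothetical; the tolerance is an arbitrary choice for the example.

```python
import pytest


def mean_accuracy(scores):
    """Hypothetical helper: average a list of per-fold accuracy scores."""
    return sum(scores) / len(scores)


def test_float_sum_is_close_enough():
    # Exact comparison fails because of binary floating point rounding.
    assert 0.1 + 0.2 != 0.3
    # approx() compares within a small default relative tolerance.
    assert 0.1 + 0.2 == pytest.approx(0.3)


def test_mean_accuracy_within_tolerance():
    scores = [0.81, 0.79, 0.80]
    # An explicit absolute tolerance states how close is close enough.
    assert mean_accuracy(scores) == pytest.approx(0.80, abs=1e-3)
```

For quantities that vary more between runs, such as training time, a looser `rel` or `abs` tolerance can be passed in the same way.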
Diagnostics for
Machine
Learning
04
Engineering vs. Experimentation
What if it’s a false dichotomy?
The Yellowbrick API
Feature visualization: Data Loader → Transformer(s) → Feature Visualization, using fit(), transform(), draw()
Model evaluation: Data Loader → Transformer(s) → Estimator → Evaluation Visualization, using fit(), predict(), score(), draw()
dog
muffin
import matplotlib.pyplot as plt
from sklearn.linear_model import SGDClassifier
from sklearn.ensemble import RandomForestClassifier
from yellowbrick.classifier import ClassificationReport
from sklearn.model_selection import train_test_split as tts

def muffins_or_dogs(X, y, model, classes=["dog", "muffin"]):
    fig, ax = plt.subplots()
    X_train, X_test, y_train, y_test = tts(X, y, random_state=38)
    visualizer = ClassificationReport(
        model, classes=classes, cmap="Greys", ax=ax,
        support=True, show=False
    )
    visualizer.fit(X_train, y_train)
    score = visualizer.score(X_test, y_test)
    image_path = visualizer.estimator.__class__.__name__ + ".png"
    visualizer.show(outpath=image_path)
    return visualizer.estimator.predict(X_test)
Tips & Tricks
Leverage an ML API
Systematize tests by
wrapping open source ML
frameworks
Pipeline ML Steps
Chain ML steps to support
accuracy &
reproducibility
Drill into Fuzziness
Use parameterization &
approximation to deal with
non-determinism
Embrace Consistency
Adopt a team-wide
coding style to facilitate
collaboration
Befriend Small Robots
CI/CD helps flag test
regressions &
dependency changes
Experiment with Care
Use diagnostic tools
that don’t interfere
with testability
Thank you!
Template by SlidesGo
Icons by Flaticon
Images by Freepik
