What are the Unique Challenges and
Opportunities in Systems for ML?
Matei Zaharia
AI Researcher: AI is going to change all of computing!
AI Researcher: It’s intelligent and you don’t need to program anymore and you just differentiate things...
AI Researcher (to Systems Researcher): How does it affect your research field?
Systems Researcher: Umm, I figured out a way to shave off some system calls!
AI Researcher (to Networking Researcher): How does it affect your research field?
Networking Researcher: I came up with a new congestion control scheme
Motivation
ML workloads can certainly influence a lot of systems,
but what are the unique research challenges they raise?
Turns out there are a lot! ML is very different from
traditional software, and we should look at how
My Perspective
Stanford DAWN: research lab focused on infrastructure for usable machine learning
Databricks: data & ML platform for 2000+ orgs
How Does ML Differ from Traditional Software?
Traditional Software
Goal: meet a functional
specification
Quality depends only on
application code
Mostly deterministic
Machine Learning
Goal: optimize a metric
(e.g. accuracy)
Quality depends on input data
and tuning parameters
Stochastic
Some Interesting Opportunities
ML Platforms: software for managing and productionizing ML
Data-oriented model training, QA and debugging tools
Optimizations leveraging the stochastic nature of ML
ML-Aware System Optimization:
NoScope & BlazeIt
The ML Inference Bottleneck
Inference cost is often 100x higher than training
overall, and greatly limits deployments
Example: processing 1 video
stream in real time with CNNs
requires a $1000 GPU
Inference Optimization in NoScope
Idea: optimize execution of ML models for a specific
application or query
• Model specialization: train a small DNN to recognize the
specific class in the dataset (e.g. “buses in street video”)
• Query optimization: tune a cascade of models to achieve a target accuracy
[Diagram labels: User Query, Dataset, Specialized Model, Target Model]
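A minimal sketch of the cascade idea, assuming a cheap specialized model that returns a confidence score in [0, 1] and an expensive target model that returns a final label. The function names, the two thresholds, and the toy models are hypothetical and are not taken from NoScope itself; in a real system the thresholds would be tuned to hit the target accuracy.

```python
from typing import Callable, Sequence, Tuple

Frame = Sequence[float]


def cascade_predict(
    frame: Frame,
    specialized: Callable[[Frame], float],
    target: Callable[[Frame], bool],
    low: float,
    high: float,
) -> Tuple[bool, str]:
    """Return (label, name of the model that produced it) for one frame."""
    score = specialized(frame)  # cheap model runs on every frame
    if score <= low:
        return False, "specialized"  # confidently negative, skip the big model
    if score >= high:
        return True, "specialized"  # confidently positive, skip the big model
    return target(frame), "target"  # uncertain: fall back to the full model


# Toy stand-ins for the two models (assumptions for illustration only).
def toy_specialized(frame: Frame) -> float:
    return sum(frame) / len(frame)


def toy_target(frame: Frame) -> bool:
    return max(frame) > 0.9
```

Only frames whose score falls between `low` and `high` pay for the target model, which is where the savings come from.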
NoScope Results
VLDB ‘17, github.com/stanford-futuredata/noscope
Optimizing ML + SQL in BlazeIt
[Kang et al, CIDR 2019]
[Diagram labels: SQL Query, Frames from Video, Query Plan with Specialized DNNs, Object Detection DNN, ResNet-50]
BlazeIt Optimizations
Aggregation Queries
Accelerate approximate queries by using specialized model’s output as a control variate for sampling
E.g.: find average # of cars/frame
Limit Queries
Use specialized models to sort frames by likelihood of matching query, then run full model
E.g.: SELECT * FROM frames
WHERE #(red buses) > 3 LIMIT 5
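The two optimizations can be sketched in a few lines of Python. This is an illustration under stated assumptions, not BlazeIt's implementation: `cheap` stands for the specialized model, `expensive` for the full model, and all function names are hypothetical. The aggregation sketch assumes the cheap model is run on every frame so that its exact mean is known; the coefficient is estimated from the same sample, which is a common simplification.

```python
import random
import statistics


def control_variate_mean(frames, cheap, expensive, sample_size, seed=0):
    """Estimate the mean of expensive(frame) using cheap(frame) as a control variate."""
    rng = random.Random(seed)
    cheap_all = [cheap(x) for x in frames]  # cheap model runs on every frame
    mu_cheap = statistics.fmean(cheap_all)  # so its true mean is known
    idx = rng.sample(range(len(frames)), sample_size)
    f = [expensive(frames[i]) for i in idx]  # expensive model only on the sample
    g = [cheap_all[i] for i in idx]
    f_bar, g_bar = statistics.fmean(f), statistics.fmean(g)
    var_g = sum((v - g_bar) ** 2 for v in g)
    cov_fg = sum((a - f_bar) * (b - g_bar) for a, b in zip(f, g))
    c = cov_fg / var_g if var_g > 0 else 0.0  # variance-minimizing coefficient
    return f_bar - c * (g_bar - mu_cheap)


def limit_query(frames, cheap_score, expensive_predicate, limit):
    """Return (matches, number of expensive calls) for a LIMIT-style query."""
    ranked = sorted(frames, key=cheap_score, reverse=True)  # most likely first
    matches, calls = [], 0
    for frame in ranked:
        calls += 1
        if expensive_predicate(frame):  # full model confirms the match
            matches.append(frame)
            if len(matches) == limit:
                break
    return matches, calls
```

The better the cheap model correlates with the full model, the lower the variance of the aggregate estimate and the fewer full-model calls the limit query needs.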
BlazeIt Results
Aggregation Queries Limit Queries
Quality Assurance for ML with
Model Assertions
Motivation
ML applications fail in complex, hard-to-debug ways
• Tesla cars crashing into lane dividers
• Gender classification incorrect
based on race
How can we test and improve quality of ML apps?
Model Assertions
Predicates on input/output of an ML application
(similar to software assertions)
[Kang, Raghavan et al, NeurIPS MLSys 2018]
Example over three consecutive frames (Frame 1, Frame 2, Frame 3):
assert(cars should not flicker in and out)
Uses: improved training (data selection & weak supervision) and runtime monitoring
Example Assertions
Problem Domain | Assertion
Video analytics | Objects should not flicker in and out across frames
Autonomous vehicles | LIDAR and video object detectors should agree
Heart rhythm classification | Output class should not change frequently
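The heart rhythm assertion, for instance, could be checked with a sliding window over the predicted classes. This is a sketch only: the window length, the change limit, and the function name are hypothetical parameters chosen for illustration.

```python
from typing import List, Sequence


def too_many_class_changes(
    predictions: Sequence[str], window: int, max_changes: int
) -> List[int]:
    """Return start indices of windows whose predicted class changes too often."""
    flagged = []
    for start in range(len(predictions) - window + 1):
        chunk = predictions[start:start + window]
        # Count adjacent pairs whose predicted class differs.
        changes = sum(1 for a, b in zip(chunk, chunk[1:]) if a != b)
        if changes > max_changes:
            flagged.append(start)
    return flagged


# Toy prediction stream: "N" = normal rhythm, "AF" = atrial fibrillation.
preds = ["N", "N", "N", "AF", "N", "AF", "N", "N"]
flagged = too_many_class_changes(preds, window=4, max_changes=2)
```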
Using Model Assertions
Inference time
» Runtime monitoring
» Corrective action
Training time
» Active learning
» Weak supervision via
correction rules
Active Learning with Assertions:
Can assertions help select data to label & train on?
Key idea: new active learning algorithm samples data that
is most likely to reduce # failing assertions
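The slide does not spell out the selection algorithm, so the following is a deliberately naive stand-in rather than the method from the paper: it ranks unlabeled examples by how many assertions they fail and spends the labeling budget on the worst offenders. All names are hypothetical.

```python
from typing import Callable, List, Sequence


def select_for_labeling(
    examples: Sequence[dict],
    assertions: Sequence[Callable[[dict], bool]],
    budget: int,
) -> List[int]:
    """Return indices of up to `budget` examples, most failed assertions first."""
    scored = []
    for idx, example in enumerate(examples):
        n_fail = sum(1 for check in assertions if not check(example))
        scored.append((n_fail, idx))
    # Sort by failure count (descending), breaking ties by original order.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [idx for n_fail, idx in scored[:budget] if n_fail > 0]


# Toy model outputs and two toy assertions (assumptions for illustration).
outputs = [
    {"boxes": 1, "score": 0.9},
    {"boxes": 5, "score": 0.2},
    {"boxes": 4, "score": 0.8},
    {"boxes": 2, "score": 0.1},
]
checks = [lambda o: o["boxes"] <= 3, lambda o: o["score"] >= 0.5]
chosen = select_for_labeling(outputs, checks, budget=2)
```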
Using assertions for active learning improves model quality.
[Chart: mAP by selection method for 2000 new labels]
Weak Supervision with Assertions:
Can assertions improve quality without human labeling?
Key idea: consistency constraints API lets devs say which
attributes should stay constant across outputs in a dataset
E.g. “each tracked object should always have same class”,
“each person should have consistent detected gender”
Task | Pretrained | Weakly Supervised
AV perception (mAP) | 10.6 | 14.1 (+33%)
Object detection (mAP) | 34.4 | 49.9 (+45%)
ECG (% accuracy) | 70.7 | 72.1 (+2%)
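One plausible reading of a consistency constraint such as “each tracked object should always have same class” is a majority vote per tracked object, with the corrected outputs reused as weak labels. The sketch below assumes that reading; the function name, the dictionary keys, and the majority-vote rule are assumptions, not the API described in the paper.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def weak_labels_from_consistency(
    outputs: List[Dict], id_key: str, attr_key: str
) -> List[Dict]:
    """Return copies of outputs with attr_key replaced by the majority value per id."""
    values_by_id = defaultdict(list)
    for out in outputs:
        values_by_id[out[id_key]].append(out[attr_key])
    majority = {
        obj_id: Counter(values).most_common(1)[0][0]
        for obj_id, values in values_by_id.items()
    }
    # Copy each output so the original predictions stay untouched.
    return [dict(out, **{attr_key: majority[out[id_key]]}) for out in outputs]


# Toy detections: track 7 is labeled "truck" in one frame by mistake.
detections = [
    {"track": 7, "cls": "car"},
    {"track": 7, "cls": "truck"},
    {"track": 7, "cls": "car"},
    {"track": 8, "cls": "person"},
]
corrected = weak_labels_from_consistency(detections, "track", "cls")
```

The corrected records can then be added to the training set without any human labeling.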
[Figure: model quality after retraining, original SSD model vs. retrained SSD model]
[Kang, Raghavan et al, NeurIPS MLSys 2018]
ML Platforms: Programming and
Deployment Systems for ML
ML at Industrial Scale
Today, ML development is ad-hoc:
• Hard to track experiments & metrics: users do it best-effort
• Hard to reproduce results: won’t happen by default
• Hard to share & deploy models: different dev & deploy stacks
Each app takes months to build, and then needs to
continuously be maintained!
ML Platforms
A new class of systems to manage the ML lifecycle
Pioneered by company-specific platforms: Facebook
FBLearner, Uber Michelangelo, Google TFX, etc
+Standardize the data prep / training / deploy cycle:
if you work with the platform, you get these!
–Limited to a few algorithms or frameworks
–Tied to one company’s infrastructure
MLflow from Databricks
Open source, open-interface ML platform (mlflow.org)
• Works with any existing ML library and deployment service
[Diagram labels]
Reproducible Projects: Project, Project Spec (Deps, Params)
Experiment Tracking: your_code.py calling log_param("alpha", 0.5), log_metric("rmse", 0.2), log_model(my_model); Tracking Server (UI, API); REST API
Deployment Targets: Inference Code, Bulk Scoring, Cloud Serving Tools
MLflow Projects: Reproducible Runs
Simple packaging format for code + dependencies

my_project/
├── MLproject
├── conda.yaml
├── main.py
└── model.py
...

Contents of MLproject:
conda_env: conda.yaml
entry_points:
  main:
    parameters:
      training_data: path
      lr: {type: float, default: 0.1}
    command: python main.py {training_data} {lr}

$ mlflow run git://<my_project>
mlflow.run("git://<my_project>", ...)
Composing Projects
# Slide pseudocode: r1 stands for a result of the ProjectA run that can be compared to 0
r1 = mlflow.run("ProjectA", params)
if r1 > 0:
    r2 = mlflow.run("ProjectB", …)
else:
    r2 = mlflow.run("ProjectC", …)
r3 = mlflow.run("ProjectD", r2)
MLflow Tracking: Logging for ML
[Diagram: Notebooks, Local Apps, and Cloud Jobs log to the Tracking Server (UI, API) over a REST API]
mlflow.log_param("alpha", 0.5)
mlflow.log_metric("accuracy", 0.9)
...
Tracking UI: Inspecting Runs
MLflow Models: Packaging Models
Packages arbitrary code (not just model weights)
[Diagram labels: Model Logic, Packaging Format, Model Format (ONNX Flavor, Python Flavor, . . .), Batch Inference, REST Serving, Testing & Debug Tools (LIME, TCAV)]
MLflow Community Growth
140 contributors from >50 companies since June 2018
850K downloads/month
Major external contributions:
• Docker & Kubernetes execution
• R API
• Integrations with PyTorch, H2O, HDFS, GCS, …
• Plugin system
Other ML-Specific Research Opportunities
Data validation and monitoring (e.g. TFX Data Validation)
Supervision-oriented systems (e.g. Snorkel, Overton)
Leveraging the numeric nature of ML for optimization,
security, etc (e.g. TASO, HogWild, SSP, federated ML)
Conclusion
Many systems problems specific to ML are not
heavily studied in research
• App lifecycle, data quality & monitoring, model QA, etc
These are also major problems in practice!
Follow DAWN’s research at dawn.cs.stanford.edu

%in Harare+277-882-255-28 abortion pills for sale in Harare
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 

What are the Unique Challenges and Opportunities in Systems for ML?

  • 1. What are the Unique Challenges and Opportunities in Systems for ML? Matei Zaharia
  • 2. 😀 🙂 AI is going to change all of computing! AI Researcher Systems Researcher
  • 3. 😀 🙂 It’s intelligent and you don’t need to program anymore and you just differentiate things... AI Researcher Systems Researcher
  • 4. 😀 🙂 How does it affect your research field? AI Researcher Systems Researcher
  • 5. 😀 🙂 How does it affect your research field? AI Researcher Systems Researcher Umm, I figured out a way to shave off some system calls!
  • 6. 😀 🙂 How does it affect your research field? AI Researcher Networking Researcher I came up with a new congestion control scheme
  • 7. 😐 🙂 AI Researcher Networking Researcher I came up with a new congestion control scheme
  • 8. Motivation ML workloads can certainly influence a lot of systems, but what are the unique research challenges they raise? Turns out there are a lot! ML is very different from traditional software, and we should look at how
  • 9. My Perspective Research lab focused on infrastructure for usable machine learning Data & ML platform for 2000+ orgs
  • 10. How Does ML Differ from Traditional Software? Traditional Software: goal is to meet a functional specification; quality depends only on application code; mostly deterministic. Machine Learning: goal is to optimize a metric (e.g. accuracy); quality depends on input data and tuning parameters; stochastic.
  • 11. Some Interesting Opportunities: ML platforms (software for managing and productionizing ML); data-oriented model training, QA and debugging tools; optimizations leveraging the stochastic nature of ML.
  • 13. The ML Inference Bottleneck: inference cost is often 100x higher than training overall, and greatly limits deployments. Example: processing 1 video stream in real time with CNNs requires a $1000 GPU.
  • 14. Inference Optimization in NoScope. Idea: optimize execution of ML models for a specific application or query. • Model specialization: train a small DNN to recognize the specific class in the dataset (e.g. “buses in street video”). • Query optimization: tune a cascade of models to achieve a target accuracy. (Diagram labels: Target Model, Specialized Model, Dataset, User Query)
  • 15. NoScope Results VLDB ‘17, github.com/stanford-futuredata/noscope
  • 16. Optimizing ML + SQL in BlazeIt [Kang et al, CIDR 2019] Object Detection DNN Frames from Video Query Plan with Specialized DNNs Resnet 50 SQL Query
  • 17. BlazeIt Optimizations. Aggregation queries: accelerate approximate queries by using the specialized model’s output as a control variate for sampling (e.g. find average # of cars/frame). Limit queries: use specialized models to sort frames by likelihood of matching the query, then run the full model (e.g. SELECT * FROM frames WHERE #(red buses) > 3 LIMIT 5).
  • 19. Quality Assurance for ML with Model Assertions
  • 20. Motivation: ML applications fail in complex, hard-to-debug ways. • Tesla cars crashing into lane dividers • Gender classification incorrect based on race. How can we test and improve quality of ML apps?
  • 21. Model Assertions: predicates on input/output of an ML application (similar to software assertions) [Kang, Raghavan et al, NeurIPS MLSys 2018]. Example over Frame 1, Frame 2, Frame 3: assert(cars should not flicker in and out). Uses: improved training (data selection & weak supervision); runtime monitoring.
  • 22. Example Assertions (problem domain: assertion). Video analytics: objects should not flicker in and out across frames. Autonomous vehicles: LIDAR and video object detectors should agree. Heart rhythm classification: output class should not change frequently.
  • 23. Using Model Assertions. Inference time: » Runtime monitoring » Corrective action. Training time: » Active learning » Weak supervision via correction rules.
  • 24. Active Learning with Assertions: Can assertions help select data to label & train on? Key idea: new active learning algorithm samples data that is most likely to reduce # failing assertions
  • 25. Active Learning with Assertions: Can assertions help select data to label & train on? Using assertions for active learning improves model quality. (Chart: mAP by selection method for 2000 new labels)
  • 26. Weak Supervision with Assertions: Can assertions improve quality without human labeling? Key idea: consistency constraints API lets devs say which attributes should stay constant across outputs in a dataset E.g. “each tracked object should always have same class”, “each person should have consistent detected gender”
  • 27. Weak Supervision with Assertions: Can assertions improve quality without human labeling? Results (task: pretrained → weakly supervised). AV perception (mAP): 10.6 → 14.1 (+33%). Object detection (mAP): 34.4 → 49.9 (+45%). ECG (% accuracy): 70.7 → 72.1 (+2%).
  • 28. Model Quality After Retraining (images: Original SSD Model, Retrained SSD Model) [Kang, Raghavan et al, NeurIPS MLSys 2018]
  • 29. ML Platforms: Programming and Deployment Systems for ML
  • 30. ML at Industrial Scale. Today, ML development is ad hoc: • Hard to track experiments & metrics: users do it best-effort • Hard to reproduce results: won’t happen by default • Hard to share & deploy models: different dev & deploy stacks. Each app takes months to build, and then needs to be continuously maintained!
  • 31. ML Platforms: a new class of systems to manage the ML lifecycle. Pioneered by company-specific platforms: Facebook FBLearner, Uber Michelangelo, Google TFX, etc. + Standardize the data prep / training / deploy cycle: if you work with the platform, you get these! - Limited to a few algorithms or frameworks - Tied to one company’s infrastructure
  • 32. MLflow from Databricks: open source, open-interface ML platform (mlflow.org) • Works with any existing ML library and deployment service. Example calls in your_code.py: log_param("alpha", 0.5) log_metric("rmse", 0.2) log_model(my_model). (Diagram labels: Reproducible Projects, Experiment Tracking, Deployment Targets; Project, Project Spec, Deps, Params; Tracking Server, UI, API, REST API; Inference Code, Bulk Scoring, Cloud Serving Tools)
  • 33. MLflow Projects: Reproducible Runs. Simple packaging format for code + dependencies. Project layout: my_project/ containing MLproject, conda.yaml, main.py, model.py, ... MLproject contents: conda_env: conda.yaml; entry_points: main: parameters: training_data: path, lr: {type: float, default: 0.1}; command: python main.py {training_data} {lr}. Run with: $ mlflow run git://<my_project> or mlflow.run("git://<my_project>", ...)
  • 34. Composing Projects: r1 = mlflow.run("ProjectA", params) if r1 > 0: r2 = mlflow.run("ProjectB", …) else: r2 = mlflow.run("ProjectC", …) r3 = mlflow.run("ProjectD", r2)
  • 35. MLflow Tracking: Logging for ML. mlflow.log_param("alpha", 0.5) mlflow.log_metric("accuracy", 0.9) ... (Diagram labels: Notebooks, Local Apps, Cloud Jobs; Tracking Server, UI, API, REST API)
  • 37. MLflow Models: Packaging Models. Packages arbitrary code (not just model weights). (Diagram labels: Packaging Format, Model Format, ONNX Flavor, Python Flavor, Model Logic, Batch Inference, REST Serving, Testing & Debug Tools, LIME, TCAV, . . .)
  • 38. MLflow Community Growth: 140 contributors from >50 companies since June 2018; 850K downloads/month. Major external contributions: • Docker & Kubernetes execution • R API • Integrations with PyTorch, H2O, HDFS, GCS, … • Plugin system
  • 39. Other ML-Specific Research Opportunities: data validation and monitoring (e.g. TFX Data Validation); supervision-oriented systems (e.g. Snorkel, Overton); leveraging the numeric nature of ML for optimization, security, etc (e.g. TASO, HogWild, SSP, federated ML).
  • 40. Conclusion: many systems problems specific to ML are not heavily studied in research • App lifecycle, data quality & monitoring, model QA, etc. These are also major problems in practice! Follow DAWN’s research at dawn.cs.stanford.edu
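
The flicker assertion from slides 21 and 22 ("objects should not flicker in and out across frames") could be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `flickering_objects`, the input representation (one set of tracked object IDs per frame), and the `max_gap` parameter are all hypothetical.

```python
from typing import Dict, Hashable, List, Set


def flickering_objects(frames: List[Set[Hashable]], max_gap: int = 2) -> Set[Hashable]:
    """Return IDs that vanish and then reappear after at most `max_gap` missing frames.

    Hypothetical helper: frames[i] is assumed to be the set of tracked
    object IDs that a detector reported in frame i.
    """
    last_seen: Dict[Hashable, int] = {}  # object ID -> index of the last frame containing it
    flagged: Set[Hashable] = set()
    for i, ids in enumerate(frames):
        for obj in ids:
            if obj in last_seen:
                gap = i - last_seen[obj] - 1  # number of frames the object was missing
                if 1 <= gap <= max_gap:
                    flagged.add(obj)  # short disappearance: treat as a likely detector error
            last_seen[obj] = i
    return flagged


# Object "a" is missing only in the middle frame, so the assertion flags it.
example_frames = [{"a", "b"}, {"b"}, {"a", "b"}]
print(flickering_objects(example_frames))  # {'a'}
```

In the spirit of slide 23, the flagged IDs could feed runtime monitoring at inference time, or mark frames worth labeling at training time. The gap threshold is a tuning choice: a longer absence is treated here as an object that genuinely left the scene rather than a flicker.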