The ML.NET 1.0 release is the first major milestone of a journey that started in May 2018, when we released ML.NET 0.1 as open source. ML.NET is an open-source, cross-platform machine learning framework for .NET developers. With ML.NET, developers can use their existing tools and skill sets to build custom machine learning models and infuse custom AI into their applications for common scenarios such as Sentiment Analysis, Recommendation, Image Classification and more.
“Automated ML” is a collection of new technologies from Microsoft to enhance the data science development process. Still in preview, Auto ML for ML.NET 1.0 will be demonstrated in a Deep Learning Virtual Machine running Windows Server 2016. Code examples are in C# and run in Visual Studio Community 2019.
This presentation is the second of four related to ML.NET and Automated ML. The presentation will be recorded with video posted to this YouTube Channel: http://bit.ly/2ZybKwI
8. “It has exquisite buttons … with long sleeves … works for casual as well as business settings”
f(x)
Machine Learning
“Programming the UnProgrammable”
10. ML.NET 1.0
Machine Learning framework for building custom ML Models
• Custom ML made easy: Automated ML and Tools (Model Builder and CLI)
• Proven at scale: Azure, Office, Windows
• Extensible: TensorFlow, ONNX and Infer.NET
• Cross-platform and open-source: Runs everywhere
15. How much is this car worth?
Machine Learning Problem Example
16. Model Creation Is Typically Time-Consuming
Which features? Mileage, Condition, Car brand, Year of make, Regulations, …
Which algorithm? Gradient Boosted, Nearest Neighbors, SVM, Bayesian Regression, LGBM, …
Which parameters? Parameter 1, Parameter 2, Parameter 3, Parameter 4, …
Example: Mileage, Car brand, Year of make → Gradient Boosted (Criterion, Loss, Min Samples Split, Min Samples Leaf, Others) → Model
17. Model Creation Is Typically Time-Consuming
Which features? Mileage, Condition, Car brand, Year of make, Regulations, …
Which algorithm? Gradient Boosted, Nearest Neighbors, SVM, Bayesian Regression, LGBM, …
Which parameters?
Gradient Boosted: Criterion, Loss, Min Samples Split, Min Samples Leaf, Others
Nearest Neighbors: N Neighbors, Weights, Metric, P, Others
Iterate:
Mileage, Car brand, Year of make → Gradient Boosted
Car brand, Year of make, Condition → Nearest Neighbors → Model
18. Model Creation Is Typically Time-Consuming
Which algorithm? Which parameters? Which features?
Iterate
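The iteration over features, algorithms and parameters described on these slides can be sketched as a brute-force loop. This is a minimal illustration only: the car rows are made-up toy data, and the two "algorithms" (a mean baseline and a plain nearest-neighbors average) and the names `ROWS`, `ALGORITHMS` and `manual_search` are hypothetical stand-ins, not anything from ML.NET or Automated ML.

```python
from itertools import combinations

FEATURES = ("mileage_k", "age_years", "condition")

# Made-up toy rows: (mileage in thousands, age in years, condition 1-10) -> price in thousands
ROWS = [
    ((20, 1, 9), 30.0),
    ((40, 3, 8), 24.0),
    ((60, 5, 7), 19.0),
    ((80, 7, 6), 15.0),
    ((100, 9, 5), 11.0),
    ((120, 11, 4), 8.0),
    ((30, 2, 9), 27.0),
    ((90, 8, 5), 13.0),
]


def predict_mean(train, x, cols):
    """Baseline: ignore the features and predict the average training price."""
    return sum(y for _, y in train) / len(train)


def predict_knn(train, x, cols, k):
    """Average the prices of the k training rows closest to x on the chosen columns."""
    ranked = sorted(train, key=lambda row: sum((row[0][c] - x[c]) ** 2 for c in cols))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Each algorithm comes with its own (hypothetical) parameter grid.
ALGORITHMS = {
    "mean_baseline": (predict_mean, [{}]),
    "nearest_neighbors": (predict_knn, [{"k": 1}, {"k": 2}, {"k": 3}]),
}


def manual_search(rows, n_train=6):
    """Try every feature subset x algorithm x parameter set; return results sorted by error."""
    train, test = rows[:n_train], rows[n_train:]
    results = []
    for size in range(1, len(FEATURES) + 1):
        for cols in combinations(range(len(FEATURES)), size):
            for name, (predict, grid) in ALGORITHMS.items():
                for params in grid:
                    mae = sum(abs(predict(train, x, cols, **params) - y) for x, y in test) / len(test)
                    results.append((mae, name, params, tuple(FEATURES[c] for c in cols)))
    return sorted(results, key=lambda r: r[0])


if __name__ == "__main__":
    for mae, name, params, cols in manual_search(ROWS)[:3]:
        print(f"{mae:.2f}  {name}  {params}  {cols}")
```

Even with three features, two algorithms and a tiny grid, the loop already evaluates 28 candidates, which is the combinatorial growth the slides point at.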
19. Automated ML Accelerates Model Development
Input: Enter data, Define goals, Apply constraints
Intelligently test multiple models in parallel
Output: Optimized model
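The flow on this slide (data in, a goal, a constraint, parallel trials, best model out) can be sketched in a few lines. This is an assumption-laden toy, not how Automated ML is implemented: the data is made up, the goal is "lowest mean absolute error", the constraint is a simple trial budget, and `auto_search`, `knn_model` and `CANDIDATES` are hypothetical names. Threads are used only to show the shape of running trials concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

# Made-up toy data: (mileage in thousands, price in thousands)
TRAIN = [(20, 30.0), (40, 24.0), (60, 19.0), (80, 15.0), (100, 11.0), (120, 8.0)]
VALID = [(30, 27.0), (90, 13.0)]


def knn_model(k):
    """Return a predict function that averages the k nearest training prices."""
    def predict(x):
        nearest = sorted(TRAIN, key=lambda row: abs(row[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict


def mean_abs_error(predict):
    """Goal metric: mean absolute error on the validation rows."""
    return sum(abs(predict(x) - y) for x, y in VALID) / len(VALID)


# Hypothetical candidate list: name -> zero-argument factory that builds a model.
CANDIDATES = {f"knn_k{k}": (lambda k=k: knn_model(k)) for k in (1, 2, 3, 4)}


def auto_search(candidates, max_trials, workers=4):
    """Score up to max_trials candidates concurrently and return (error, name) of the best."""
    trials = list(candidates.items())[:max_trials]  # constraint: trial budget

    def run(item):
        name, factory = item
        return mean_abs_error(factory()), name

    with ThreadPoolExecutor(max_workers=workers) as pool:
        scored = list(pool.map(run, trials))
    return min(scored)  # goal: lowest validation error


if __name__ == "__main__":
    print(auto_search(CANDIDATES, max_trials=4))
```

Tightening the constraint changes the answer: with `max_trials=1` only the first candidate is scored, so the result is whatever that single trial produced.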
20. Automated ML Capabilities
• Based on Microsoft Research
• Brain trained with several million experiments
• Collaborative filtering and Bayesian optimization
• Privacy preserving: No need to “see” the data
21. Automated ML Capabilities
• ML Scenarios: Classification & Regression, Forecasting
• Languages: Python SDK for deployment and hosting for inference – Jupyter notebooks
• Training Compute: Local Machine, AML Compute, Data Science Virtual Machine (DSVM), Azure Databricks*
• Transparency: View run history, model metrics, explainability*
• Scale: Faster model training using multiple cores and parallel experiments
* In Preview
23. Guardrails
Class imbalance
Train-Test split, CV, rolling CV
Missing value imputation
Detect high cardinality features
Detect leaky features
Detect overfitting
Model Interpretability / Feature Importance
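A few of the guardrails listed above can be illustrated with simple checks. This is a minimal sketch under stated assumptions: the thresholds (`min_share`, `max_unique_ratio`), the median imputation strategy and all function names are hypothetical choices for illustration, not the rules Automated ML actually applies.

```python
from collections import Counter
from statistics import median


def class_imbalance(labels, min_share=0.2):
    """Return the classes whose share of the rows falls below min_share."""
    counts = Counter(labels)
    total = len(labels)
    return sorted(c for c, n in counts.items() if n / total < min_share)


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def high_cardinality(column, max_unique_ratio=0.5):
    """Flag a categorical column when more than max_unique_ratio of its values are distinct."""
    return len(set(column)) / len(column) > max_unique_ratio


def rolling_splits(n_rows, initial, step):
    """Yield (train, test) index lists with an expanding training window, for time-ordered rows."""
    start = initial
    while start < n_rows:
        end = min(start + step, n_rows)
        yield list(range(start)), list(range(start, end))
        start = end


if __name__ == "__main__":
    print(class_imbalance(["ok"] * 9 + ["fraud"]))
    print(impute_median([1, None, 3]))
    print(high_cardinality(["id1", "id2", "id3", "id4"]))
    print(list(rolling_splits(6, initial=2, step=2)))
```

The rolling split never lets a test row precede a training row, which is the property that makes it suitable for forecasting-style validation.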
31. Automated ML Customer Testimonials
• Press coverage from public preview:
• CNET
• VentureBeat
• PRNewswire
“I quite like your AutoML function. It gives me good results compared to other libraries I tested before (tpot and auto-sklearn) that I believe was only looking at scores and often gave me models that over-trained my data. And of course the model from your suggested code is better.”
- Big oil company
“I will start with AutoML and use the algorithm that AutoML recommends to further tune the model”
- Data Scientist
“I actually enjoy being able to use AutoML in a Jupyter notebook. The DataRobot interface was nice for non-experts, but for someone like me, it felt a bit basic.”
- Data Scientist