Did you know that machine learning can be applied to almost any project? And you no longer need to learn a new programming language (such as Python or R) or master numerical methods to do it. In this talk I will cover the basics of machine learning and show how easy it is to start using it in your .NET projects with ML.NET and other solutions from Microsoft.
2.
.NET LEVEL UP
About me
.NET CONFERENCE #1 IN UKRAINE KYIV 2019
Olia Gavrysh
Program Manager
Microsoft, .NET team
twitter: @oliagavrysh
3.
Let me learn something about you…
5.
When you start Machine Learning
without calculus
6.
Agenda
1. Machine Learning crash course
2. Building an ML model for your .NET app with ML.NET
10.
Machine Learning
“Programming the UnProgrammable”
rooms, bedrooms, bathrooms
location, view, near school
footage
year built
garage, basement, patio
…
f(x)
14.
How ML works
ŷ = f(x)
Fcost = |y - ŷ| → 0
ŷ - our model's prediction
y - actual values (known answers)
Fcost - the difference between the model's predictions and the actual values
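To make the slide's formula concrete, here is a minimal sketch of a cost function. It assumes a toy one-feature linear model f(x) = w * x + b and averages |y - ŷ| over the known examples; the names `predict` and `cost` and the sample data are illustrative assumptions, not part of the talk.

```python
def predict(x, w, b):
    """Toy model f(x) = w * x + b; returns the prediction y_hat."""
    return w * x + b


def cost(xs, ys, w, b):
    """Average of |y - y_hat| over all examples with known answers."""
    return sum(abs(y - predict(x, w, b)) for x, y in zip(xs, ys)) / len(xs)


# Made-up data generated by y = 2x + 1.
xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]

poor_fit = cost(xs, ys, w=1.0, b=0.0)   # parameters far from the truth
exact_fit = cost(xs, ys, w=2.0, b=1.0)  # parameters that match the data
```

Training amounts to searching for the parameters (here `w` and `b`) that push the cost toward zero.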
16.
Creating ML Model
Build → Train → Evaluate → Use
17.
Building Model
Build
1. Upload Data
2. Prepare Data
3. Choose Algorithm
18.
Training Model
Running the chosen
algorithm on the data.
Train
19.
Evaluating Model
Calculate metrics on test data that show
how good the model is.
If it is not good enough, go back to the Build phase.
Evaluate
All metrics: https://docs.microsoft.com/dotnet/machine-learning/resources/metrics
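As a rough illustration of the Evaluate step, the sketch below holds out part of the data as a test set, fits a straight line on the rest, and computes two common regression metrics (mean absolute error and R-squared) on the held-out rows. The data, the split, and the helper names are made up for illustration; this is not how ML.NET itself is called.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = w * x + b on the training data."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    return w, mean_y - w * mean_x


def evaluate(xs, ys, w, b):
    """Return (mean absolute error, R-squared) on the given data."""
    preds = [w * x + b for x in xs]
    mae = sum(abs(y - p) for y, p in zip(ys, preds)) / len(ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return mae, 1 - ss_res / ss_tot


# Hypothetical (size, price) pairs.
data = [(50, 150), (60, 180), (70, 205), (80, 240),
        (90, 275), (100, 300), (110, 325), (120, 365)]

# Hold out every fourth row (starting at index 1) as the test set.
test = data[1::4]
train = [row for row in data if row not in test]

w, b = fit_line([x for x, _ in train], [y for _, y in train])
mae, r2 = evaluate([x for x, _ in test], [y for _, y in test], w, b)
```

If the metrics on the test rows are poor, that is the signal to go back to the Build phase and change the data preparation or the algorithm.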
20.
Consuming Model
Consume in your client
applications.
Use
21.
Building a model with
33.
How long to train
Dataset size   | Dataset type     | Avg. time to train
0 - 10 MB      | Numeric and text | 10 sec
10 - 100 MB    | Numeric and text | 10 min
100 - 500 MB   | Numeric and text | 30 min
500 MB - 1 GB  | Numeric and text | 60 min
1 GB+          | Numeric and text | 3+ hours
The exact training time depends on several factors, such as:
• The number of features (columns) used to predict
• The type of columns, i.e. text vs. numeric
• The type of machine learning task (e.g. regression vs. classification)
We have tested Model Builder with datasets as large as 1 TB, but building a high-quality model for
a dataset of that size can take up to four days.
34.
1. Supervised and unsupervised learning
2. Types of ML problems (https://docs.microsoft.com/en-us/dotnet/machine-learning/tutorials/index)
Pictures for different problem types: https://docs.microsoft.com/en-us/dotnet/machine-learning/automate-training-with-model-builder
3. Data prep: https://docs.microsoft.com/en-us/dotnet/machine-learning/how-to-guides/prepare-data-ml-net
4. Parameters, hyperparameters, labels
5. Training set and evaluation set; cross-validation
6. Success metrics: accuracy, … https://github.com/dotnet/machinelearning-samples/blob/master/modelbuilder/readme.md#evaluate
All metrics here: https://docs.microsoft.com/en-us/dotnet/machine-learning/resources/metrics
7. How to improve results: more time, more/better data (the latter is where data scientists are needed)
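The cross-validation item in the list above can be sketched as follows. This is a minimal k-fold index splitter, assuming contiguous folds with no shuffling; the function name `k_fold_indices` is an illustrative assumption. Each fold takes a turn as the evaluation set while the remaining rows are used for training.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, validation
        start = stop


folds = list(k_fold_indices(n_samples=7, k=3))
```

Averaging the metric over all folds gives a steadier estimate than a single train/evaluation split, at the price of training the model k times.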
35.
Price prediction step by step (one hot encoder, …): https://docs.microsoft.com/en-us/dotnet/machine-learning/tutorials/predict-prices
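For readers unfamiliar with the one hot encoder mentioned above, here is a small sketch of the idea: each distinct category becomes its own 0/1 column, so an algorithm that expects numbers can consume a text column. The column values below are hypothetical, and the function is a plain-Python illustration rather than the ML.NET transform.

```python
def one_hot_encode(values):
    """Return (sorted categories, one 0/1 row per input value)."""
    categories = sorted(set(values))
    position = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[position[value]] = 1  # mark the column for this category
        encoded.append(row)
    return categories, encoded


# Hypothetical categorical column.
payment_types = ["card", "cash", "card", "voucher"]
categories, encoded = one_hot_encode(payment_types)
```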
36.
Difference between machine learning and AI:
• If it’s written in Python, it’s probably machine learning
• If it’s written in PowerPoint, it’s probably AI
Editor's notes
In our projects, we are often asked to solve very hard problems.
Many of these problems are relatively easy for a human to solve but difficult to program a computer to solve.
Consider these examples…
How do you write code that determines whether a Netflix customer will renew their subscription or not? How do you code an application that can differentiate objects in real time from a video feed? How do you determine whether parts in a manufacturing line are defective?
How do you solve these problems when the only tools available to us are procedural code and traditional algorithms?
The three problems we just saw share three properties:
One, they involve a repeated decision or evaluation process. Two, it is difficult or impossible to explicitly describe the solution. And three, you have labeled data: existing examples where you can describe the situation and map it to the correct result.
This is where ML can help!
However, you cannot use ML if you do not have enough data, because a model is only as good as the data it was trained on. Further, whenever you can achieve the desired solution with ordinary code, there is no need for ML, since the computing resources it requires can be very expensive.
.NET is a great tech stack for building a wide variety of applications. There is ASP.NET for web development and Xamarin for mobile development, and with ML.NET we are trying to make .NET great for machine learning.
ML.NET provides tooling that makes it easy to use. Two particularly valuable tools are AutoML and Model Builder.
What is AutoML? It is an API that accelerates model development for you. Many developers do not have the experience required to build or train machine learning models. With AutoML, the process of finding the best algorithm is automated!
Model Builder, on the other hand, provides an easy-to-understand visual interface to build, train, and deploy custom machine learning models. Prior machine learning expertise is not required. It also supports AutoML.
Remember that, depending on your data, it gives you the error of each of the models it tried, and you can then decide which model to use. Most people just use the model with the least error.
And we will see it in action soon.
The data scientist is definitely not happy with 30%. They try a second time with a different algorithm, and this time they score maybe 50%.
This guesswork about which features to combine with which algorithm goes on and on until the data scientist finds the combination that performs best, with a score as close to 100% as possible.
Often one gets tired and just goes with whichever model is good enough.
AutoML, however, replaces the data scientist’s repeated model selection effort.
This means that even without a data science background, you can now build a model just by leveraging AutoML.
All you need to do is load your data, define your goal (are you trying to classify objects into two categories, or are you trying to predict a value based on past values?), and apply constraints (for example, you want performance of 70% or above).
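The loop that AutoML takes off your hands can be pictured with the toy sketch below. It is not ML.NET's AutoML implementation: it simply trains two hypothetical candidate models on made-up data, scores each on a validation set, picks the one with the least error, and checks it against a user-supplied constraint. All names here (`train_mean`, `train_line`, `select_model`, `max_error`) are assumptions made for illustration.

```python
def train_mean(xs, ys):
    """Baseline candidate: always predict the average training label."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def train_line(xs, ys):
    """Candidate: least-squares straight line y = w * x + b."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    w = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - w * mean_x
    return lambda x: w * x + b


def mean_abs_error(model, xs, ys):
    return sum(abs(y - model(x)) for x, y in zip(xs, ys)) / len(xs)


def select_model(trainers, train, valid, max_error):
    """Train every candidate, score it on validation data, keep the best."""
    errors = {}
    for name, trainer in trainers.items():
        model = trainer(*train)
        errors[name] = mean_abs_error(model, *valid)
    best = min(errors, key=errors.get)
    return best, errors, errors[best] <= max_error


# Made-up regression data: (inputs, labels).
train = ([1, 2, 3, 4, 5, 6], [3.1, 4.9, 7.2, 9.0, 10.8, 13.1])
valid = ([7, 8], [15.0, 17.1])

best, errors, meets_constraint = select_model(
    {"mean": train_mean, "line": train_line}, train, valid, max_error=0.5)
```

The `errors` dictionary plays the role of the per-model error report mentioned in the notes: you can inspect it and override the automatic choice if you prefer a different trade-off.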
In addition to accelerating the model development phase, ML.NET also provides you with model explainability.
As you can see above, Model A’s sensitivity to different features differs from that of Model B. Trip distance was very important in Model A, while trip time was most important in Model B.
Model explainability is very important because today many people build models and, in the end, are not sure which pieces of information the resulting model weighted more heavily than others.
But ML.NET provides you with both AutoML and model explainability.
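One common way to measure how sensitive a model is to each feature is permutation feature importance: shuffle one column, leave everything else alone, and see how much the error grows. The sketch below assumes a hypothetical hand-written fare model that leans mostly on trip distance; the model, the data, and the function names are illustrative and do not come from the talk.

```python
import random


def model_a(distance, minutes):
    """Hypothetical fare model that depends mostly on trip distance."""
    return 2.0 * distance + 0.1 * minutes + 3.0


def mean_abs_error(model, rows, targets):
    return sum(abs(t - model(*r)) for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, column, repeats=20, seed=0):
    """Average increase in error after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = mean_abs_error(model, rows, targets)
    total_increase = 0.0
    for _ in range(repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with only the chosen column replaced.
        permuted = [r[:column] + (v,) + r[column + 1:]
                    for r, v in zip(rows, shuffled)]
        total_increase += mean_abs_error(model, permuted, targets) - baseline
    return total_increase / repeats


# Made-up trips: (distance, minutes).
rows = [(1.0, 5.0), (2.0, 9.0), (4.0, 14.0),
        (6.0, 20.0), (8.0, 25.0), (10.0, 30.0)]
targets = [model_a(d, m) for d, m in rows]

distance_importance = permutation_importance(model_a, rows, targets, column=0)
time_importance = permutation_importance(model_a, rows, targets, column=1)
```

For this hypothetical model, shuffling distance hurts far more than shuffling time, which is the kind of difference the Model A vs. Model B comparison illustrates.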
Now let us see AutoML in action!