In this session I cover data science principles such as regression, clustering, classification, and recommendation, and how to build programmable components in Azure Machine Learning experiments using data science programming languages. The session illustrates how to implement these concepts using Azure ML Studio.
1. Data Science Essentials in Azure ML
MOSTAFA ELZOGHBI
SR. TECHNICAL EVANGELIST – MICROSOFT
@MOSTAFAELZOGHBI
2. Session Objectives & Takeaways
Basic data science concepts: regression, classification, clustering, and recommendation.
Azure ML Studio platform capabilities
3. Data Science Involves
Data science is about using data to make decisions that drive actions.
The data science process involves:
Data selection
Preprocessing
Transformation
Data Mining
Delivering value from data: interpretation and evaluation
4. Machine Learning Overview
Machine Learning is a computing technique that has its origins in artificial intelligence (AI)
and statistics. Machine Learning solutions include:
Classification - Predicting a Boolean true/false value for an entity with a given set of features.
Regression - Predicting a real numeric value for an entity with a given set of features.
Clustering - Grouping entities with similar features.
Recommendation - Recommending an item to a user based on past behavior or preferences of
similar users (Recommender systems).
5. Classification
Suppose we want to teach a computer to recognize images of chairs.
We provide a set of images to the computer and tell it which ones are chairs and
which are not.
The computer is then expected to recognize chairs, even in images it has not
seen before.
This is learning by example.
To build this experiment, we need a training set and a test set, with as much
data as we can get.
Each observation is represented by a set of numbers (features).
The training and test sets are sets of feature vectors, and the label is
represented by a number.
Example: if an image is a chair, the label is +1; otherwise it is -1.
7. Classification Cont.
Yes/No questions are the most basic form of classification.
Examples: automatic handwriting recognition, speech recognition, biometrics,
document classification, spam detection, credit card fraud, predicting customer
churn, etc.
Formally, given a training set (xi, yi) for i = 1…n, we want to create a classification model f
that can predict the label y for a new x.
The machine learning algorithm creates the function f(x) for you.
The predicted value of y for a new x is simply the sign of the function f(x).
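As a minimal sketch of the idea above (the weights and inputs here are made up for illustration, not from the deck), a linear scoring function can predict the +1/-1 label via its sign:

```python
# Sketch: classify by the sign of a linear scoring function f(x) = w·x + b.
# The parameters w and b are hypothetical "learned" values.

def f(x, w, b):
    """Linear scoring function f(x) = w·x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(x, w, b):
    """The predicted label is the sign of f(x): +1 or -1."""
    return 1 if f(x, w, b) >= 0 else -1

w, b = [0.5, -1.0], 0.2           # hypothetical learned parameters
print(predict([2.0, 0.5], w, b))  # f = 1.0 - 0.5 + 0.2 = 0.7 -> +1
print(predict([0.0, 1.0], w, b))  # f = -1.0 + 0.2 = -0.8 -> -1
```

In practice the training algorithm chooses w and b; the sign rule itself is what turns a numeric score into a class label.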
8. Regression
For predicting real-valued outcomes:
How many customers will arrive at our website next week?
How many TVs will sell next year?
Can we predict someone’s income from their click through information?
Formally, given a training set (xi, yi) for i = 1…n, we want to create a regression model f
that can predict the label y for a new x.
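A tiny worked example of this setup (assuming ordinary least squares with a single feature; the data is made up): fit f(x) = a·x + b to training pairs and predict y for a new x.

```python
# Sketch: simple linear regression fit by ordinary least squares (one feature).

def fit_linear(xs, ys):
    """Return slope a and intercept b minimizing squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - a * mean_x
    return a, b

# Hypothetical data: weekly visitor counts growing roughly linearly.
xs, ys = [1, 2, 3, 4], [2.1, 3.9, 6.0, 8.0]
a, b = fit_linear(xs, ys)
print(a * 5 + b)  # predicted value for week 5 (close to 10 with these made-up numbers)
```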
9. Supervised Learning
Classification and Regression are supervised learning problems
“Supervised” means that the training data has ground truth labels to learn from
(Supervised) classification often has +1 or -1 labels
(Supervised) regression has numerical labels
There are lots of supervised problems
Supervised learning algorithms are much easier to evaluate than unsupervised ones
11. Clustering
Clustering is an unsupervised learning problem
“Unsupervised” means that the training set has no ground truth labels to learn from
This means they are much harder to evaluate
Clustering groups data entities based on their feature values.
12. Recommendation
Recommender systems are machine learning solutions that match individuals to items
based on the preferences of other similar individuals, or other similar items that the
individual is already known to like.
Recommendation is one of the most commonly used forms of machine learning.
An example of recommender system: Netflix Contest
Build a better recommender system from Netflix data
Azure ML modules: Split, Train Matchbox Recommender,
Score Matchbox Recommender, Evaluate Recommender.
13. Recommendation Cont.
Use Score Matchbox Recommender to generate predictions
You can generate the following kinds of prediction:
Item Recommendation: Predicts recommended items based on a given user.
Related Items: Predicts recommended items based on a given item.
Rating Prediction: Predicts ratings for given users and items.
Related Users: Predicts users based on a given user.
Use the Evaluate Recommender module to evaluate recommender performance.
15. Azure ML Platform Features
Notebooks for writing code in Azure ML (Jupyter)
Building programmable components using Python & R
Publish experiments as web services
Save trained models for re-usability
Projects for creating combined assets (experiments, datasets, notebooks, etc.)
17. Thank you
Check out my blog for Azure ML articles:
www.MostafaElzoghbi.com
Follow me on Twitter: @MostafaElzoghbi
Editor's notes
Data Science essentials in Azure ML
A) Main concepts to cover for Data Science:
Regression
Classification
Clustering
Recommendation
B) Building programmable components in Azure ML experiments
C) Working with Azure ML studio
Ref: EDX.org
EDX – ch5 - classification
For a classification function to work accurately, when operating on the data in the training and test datasets, the number of times that the sign of f(x) does not equal y must be minimized. In other words, for a single entity, if y is positive, f(x) should be positive; and if y is negative, f(x) should be negative. Formally, we need to minimize cases where y ≠ sign(f(x)).
Because two numbers with the same sign always multiply to a positive value, and two numbers with different signs always multiply to a negative value, we can simplify our goal to minimizing cases where yf(x) < 0; or, over the whole dataset, minimizing the number of cases where yif(xi) < 0. This general approach is known as a loss function.
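The sign trick above can be sketched directly (the scores and labels below are made up): a prediction is wrong exactly when y·f(x) is negative, so training aims to minimize the count of such cases.

```python
# Sketch: the 0/1 misclassification loss. A prediction is wrong exactly
# when the true label y and the score f(x) have opposite signs, i.e. y*f(x) < 0.

def misclassification_count(scores, labels):
    """Count examples where the label and the score disagree in sign."""
    return sum(1 for s, y in zip(scores, labels) if y * s < 0)

scores = [0.7, -0.3, 1.2, -0.8]  # hypothetical f(xi) values
labels = [+1, +1, +1, -1]        # ground-truth yi
print(misclassification_count(scores, labels))  # only the second example is wrong -> 1
```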
As with regression algorithms, some classification algorithms add a regularization term to avoid over-fitting so that the function achieves a balance of accuracy and simplicity.
Each classification algorithm (for example AdaBoost, Support Vector Machines, and Logistic Regression) uses a specific loss function implementation, and it's this that distinguishes classification algorithms from one another.
Decision Trees are classification algorithms that define a sequence of branches. At each branch intersection, the feature value (x) is compared to a specific function, and the result determines which branch the algorithm follows. All branches eventually lead to a predicted value (-1 or +1). Most decision tree algorithms have been around for a while, and many produce low accuracy. However, boosted decision trees (AdaBoost applied to a decision tree) can be very effective.
You can use a "one vs. all" technique to extend binary classification (which predicts a Boolean value) so that it can be used in multi-class classification. This approach involves applying multiple binary classifications (for example, "is this a chair?", "is this a bird?", and so on) and reviewing the results produced by f(x) for each test. Since f(x) produces a numeric result, the predicted value is a measure of confidence in the prediction (so for example, a high positive result for "is this a chair?" combined with a low positive result for "is this a bird?" and a high negative result for "is this an elephant?" indicates a high degree of confidence that the object is more likely to be a chair than a bird, and very unlikely to be an elephant.)
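The "one vs. all" decision described above reduces to picking the class whose binary classifier reports the highest confidence. A minimal sketch (the class names and scores are made up):

```python
# Sketch: "one vs. all" multi-class classification. Each class has its own
# binary classifier producing a confidence score f(x); the prediction is the
# class with the highest score.

def one_vs_all(scores_per_class):
    """Return the class whose binary classifier gives the highest score."""
    return max(scores_per_class, key=scores_per_class.get)

# Hypothetical per-class confidence scores for one test image.
scores = {"chair": 2.3, "bird": 0.1, "elephant": -1.7}
print(one_vs_all(scores))  # -> chair
```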
When the training data is imbalanced (a high proportion of the data has the same True/False value for y), the accuracy of the classification algorithm can be compromised. To overcome this problem, you can "over-sample" or "weight" data with the minority y value to balance the algorithm.
The quality of a classification model can be assessed by plotting the True Positive Rate (the number of positives that were classified by the ML algorithm as positives divided by the total number of positives) against the False Positive Rate (the number of negatives that were classified by the ML algorithm as positives divided by the total number of negatives) for various parameter values on a chart to create a receiver operator characteristic (ROC) curve. The quality of the model is reflected in the area under the curve. The larger this area is than the area under a straight diagonal line (representing a 50% accuracy rate that can be achieved purely by guessing), the better the model.
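The two rates that make up each point on a ROC curve can be computed directly. A sketch with made-up predictions at a single threshold (a full ROC curve repeats this at many thresholds):

```python
# Sketch: True Positive Rate and False Positive Rate for one set of
# predictions, as used to plot a single point on a ROC curve.

def tpr_fpr(predicted, actual):
    """Return (TPR, FPR) for +1/-1 predicted and actual labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == -1)
    positives = sum(1 for a in actual if a == 1)
    negatives = sum(1 for a in actual if a == -1)
    return tp / positives, fp / negatives

actual    = [+1, +1, +1, +1, -1, -1, -1, -1]
predicted = [+1, +1, +1, -1, +1, -1, -1, -1]  # hypothetical classifier output
print(tpr_fpr(predicted, actual))  # (0.75, 0.25)
```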
Overfitting vs underfitting problem with regression algorithms
EDX – ch4 – Regression:
When operating on the data in the training and test datasets, the difference between the known y value and the value calculated by f(x) must be minimized. In other words, for a single entity, y - f(x) should be close to zero. This is controlled by the values of any baseline variables (b) used in the function.
For various reasons, computers work better with smooth functions than with absolute values, so the differences are squared. In other words, (y − f(x))² should be close to zero.
To apply this rule to all rows in the training or test data, the squared values are summed over the whole dataset, so ∑i(yi − f(xi))² should be close to zero. This quantity is known as the sum of squared errors (SSE). The regression algorithm minimizes this by adjusting the values of the baseline variables (b) used in the function.
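The SSE quantity described above is straightforward to compute. A sketch using a hypothetical fitted model f(x) = 2x + 1 and made-up data:

```python
# Sketch: the sum of squared errors (SSE) that a regression algorithm
# minimizes, evaluated for a hypothetical linear model f(x) = 2x + 1.

def sse(xs, ys, model):
    """Sum of (y - f(x))^2 over the dataset."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))

f = lambda x: 2 * x + 1
xs, ys = [1, 2, 3], [3.5, 5.0, 6.5]
print(sse(xs, ys, f))  # (3.5-3)^2 + (5-5)^2 + (6.5-7)^2 = 0.5
```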
Minimizing the SSE for simple linear regression models (where there is only a single x value) generally works well, but when there are multiple x values, it can lead to over-fitting. To avoid this, you can use ridge regression, in which the regression algorithm adds a regularization term to the SSE. This helps achieve a balance of accuracy and simplicity in the function.
Support vector machine regression is an alternative to ridge regression that uses a similar technique to minimize the difference between the predicted f(x) values and the known y values.
To determine the best algorithm for a specific set of data, you can use cross-validation, in which the data is divided into folds, with each fold in turn being used to test algorithms that are trained on the other folds.
To determine the optimal value for a parameter (k) for an algorithm, you can use nested cross-validation, in which one of the folds in the training set is used to validate all possible k values. This process is repeated so that each fold in the training set is used as a validation fold. The best result is then tested with the test fold, and the whole process is repeated again until every fold has been used as the test fold.
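The fold bookkeeping behind cross-validation can be sketched as follows (the scoring step is deliberately left abstract; only the index splitting is shown):

```python
# Sketch: k-fold cross-validation index generation. Each fold is used once
# as the test set while the remaining folds form the training set.
# (Assumes n is divisible by k for simplicity.)

def kfold_indices(n, k):
    """Yield (train_indices, test_indices) for each of k folds."""
    fold_size = n // k
    indices = list(range(n))
    for i in range(k):
        test = indices[i * fold_size:(i + 1) * fold_size]
        train = indices[:i * fold_size] + indices[(i + 1) * fold_size:]
        yield train, test

folds = list(kfold_indices(6, 3))
print(len(folds))   # 3 folds
print(folds[0][1])  # first test fold: [0, 1]
```

Nested cross-validation repeats the same splitting inside each training set to pick the parameter k before scoring on the outer test fold.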
Regression: Demand estimation
https://gallery.cortanaintelligence.com/Experiment/Regression-Demand-estimation-4?r=legacy
Score Model scores a trained classification or regression model against a test dataset. That is, the module generates predictions using the trained model.
Evaluate Model takes the scored dataset and uses it to generate some evaluation metrics. You can then visualize the evaluation metrics.
Set A = weather + holiday + weekday + weekend features for the predicted day
Set B = number of bikes that were rented in each of the previous 12 hours
Set C = number of bikes that were rented in each of the previous 12 days at the same hour
Set D = number of bikes that were rented in each of the previous 12 weeks at the same hour and the same day
Clustering groups data entities based on their feature values. Each data entity is described as a vector of one or more numeric features (x), which can be used to plot the entity as a specific point.
K-Means clustering is a technique in which the algorithm groups the data entities into a specified number of clusters (k). To begin, the algorithm selects k random centroid points, and assigns each data entity to a cluster based on the shortest distance to a centroid. Each centroid is then moved to the central point (i.e. the mean) of its cluster, and the distances between the data entities and the new centroids are evaluated, with entities reassigned to another cluster if it is closer. This process continues until every data entity is closer to the mean centroid of its cluster than to the centroid of any other cluster. (The accompanying image shows the result of K-Means clustering with a k value of 3.)
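The assign-then-update loop described above can be sketched on 1-D points (assumptions: k = 2, absolute distance, fixed iteration count; Azure ML's own K-Means module is more general):

```python
# Sketch: the K-Means loop on 1-D points. Assign each point to its nearest
# centroid, then move each centroid to the mean of its cluster; repeat.

def kmeans_1d(points, centroids, iterations=10):
    for _ in range(iterations):
        # Assignment step: nearest centroid per point.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to its cluster's mean.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]  # two obvious groups, made up
print(kmeans_1d(points, centroids=[0.0, 10.0]))  # converges near [1.0, 9.0]
```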
The recommender systems and matrix factorization
You can create recommenders using regression, classification, or clustering models; but a common approach is to create a filter-based recommender that uses matrix factorization. Azure ML supports the Matchbox Recommender model, which uses this technique.
In Azure ML, you should use the Recommender Split mode of the Split module to prepare data for a recommender, as this distributes user-item pairs evenly between the training and test data sets.
After splitting the data, you can use a Train Matchbox Recommender module to train a recommender.
After training the recommender, you can use the Score Matchbox Recommender to generate predictions. You can generate the following kinds of prediction:
Item Recommendation: Predicts recommended items based on a given user.
Related Items: Predicts recommended items based on a given item.
Rating Prediction: Predicts ratings for given users and items.
Related Users: Predicts users based on a given user.
Use the Evaluate Recommender module to evaluate recommender performance. You can evaluate the model based on the following metrics:
Normalized Discounted Cumulative Gain (NDCG) for item recommendation, related items, and related users.
Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) for rating prediction
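The two rating-prediction metrics named above are simple to compute by hand. A sketch with made-up predicted and actual ratings:

```python
# Sketch: MAE and RMSE over (predicted, actual) rating pairs, the metrics
# used to evaluate rating prediction in a recommender.
import math

def mae(pred, actual):
    """Mean Absolute Error: average of |predicted - actual|."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(pred)

def rmse(pred, actual):
    """Root Mean Squared Error: sqrt of the average squared error."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred))

pred   = [4.0, 3.5, 5.0, 2.0]  # hypothetical predicted ratings
actual = [4.0, 3.0, 4.0, 2.5]  # hypothetical true ratings
print(mae(pred, actual))   # (0 + 0.5 + 1.0 + 0.5) / 4 = 0.5
print(rmse(pred, actual))  # sqrt(0.375), a bit above the MAE
```

RMSE penalizes large individual errors more heavily than MAE, which is why the two are reported together.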