This presentation covers an overview of Analytics and Machine learning. It also covers the Microsoft's contribution in Machine learning space. Azure ML Studio, a SaaS based portal to create, experiment and share Machine Learning Solutions to the external world.
2. • Who says What?
• Challenges
• Analytics Evolution
• Azure Machine Learning
• Process Flow
• AzureML in a nutshell
• Summary
• Resources
• Conclusion
Agenda
3. -- Wikipedia --
“Machine learning is a subfield of computer science that evolved from the study of
pattern recognition and computational learning theory in artificial intelligence”
-- Arthur Samuel, 1959 --
“Machine Learning is the field of study that gives computers the ability to learn without
being explicitly programmed.”
-- Tom Mitchell, Carnegie Mellon University --
“A computer program is said to learn from experience E with respect to some task T
and some performance measure P, if its performance on T, as measured by P, improves
with experience E.”
Who says What ML is?
4. - Analyze the Past data – Use the Past data
- Learn from Past data – to predict the future
Definition Simplified
7. Expensive
Isolated data
Huge data
Huge Computing Power
Consequences
• Lost opportunities
• Expensive operative
mistakes
Traditional
approach
• Guessing
• Rules of thumb
• Trial and error
Challenges
Tool chaos
Complexity
9. • Enables powerful cloud-based predictive analytics
• Helps to build, deploy and share advanced analytics solutions
• Browser based, Rapid Deployment
• Connects seamlessly with other Azure data-related services, including:
• Azure HDInsight (Big Data)
• Azure SQL Database, and
• Virtual Machines
Azure Machine Learning
10. • Browser based, no installation required
• Visual Composition with end to end support
• Rapid experimentation for Better model creation
• Best in class ML algorithms (Bing, Xbox)
• Easy to use models : just search and use
• Effective collaboration with anyone, anywhere
• Extensible support for R
• Quickly deployable – as Azure Web Service to ML API service
Features and Benefits
12. Key roles
Data scientist
A highly educated and skilled person who can solve complex data problems by employing deep
expertise in scientific disciplines (mathematics, statistics or computer science)
Data professional
A skilled person who creates or maintains data systems, data solutions, or implements
predictive modeling
Roles: Database Administrator, Database Developer, or BI Developer
Software developer
A skilled person who designs and develops programming logic, and can apply machine learning
to integrate predictive functionality into applications
15. Process flow
Define
Objective
I need to predict customer
profitability…
…to deliver targeted display
advertising on the company
eCommerce web site, to:
• Present relevant product
suggestions
• Increase sales and profitability
16. Process flow
Garbage in ► Garbage out
Datasets can be sourced from:
• Internal sources, i.e. operational systems, data warehouse, etc.
• External sources
• Different formats, i.e. relational, multidimensional, text, map-reduce
Combining datasets can enrich data
E.g., integrate internal data to external data like weather, or market intelligence data
Collect
Data
17. Process flow
• Transform to cleanse, reduce or reformat
• Isolate and flag abnormal data
• Appropriately substitute missing values
• Categorize continuous values into ranges
• Normalize continuous values between 0 and 1
• having the required data to begin with is important
When designing systems, give consideration to attributes that may be
required as inputs for future modeling, e.g. demographic data: Birth date,
gender, etc.
Prepare
Data
18. Process flow
This stage is iterative, and experimentation
is required which involves:
• Selecting a machine learning algorithm
• Defining inputs and outputs
• Optimizing by configuring algorithm parameters
Model evaluation is critical to determine:
• Accuracy, Reliability, Usefulness
Train
Models
Evaluate
Models
19. Process flow
First, add a scoring experiment
• Transformational logic is replaced with a re-usable transformation resource
• Training logic is replaced with a trained model
• Web service inputs and outputs are added
• Module properties can be parameterized
Publish the experiment to the gallery
• Learn from others by discovering experiments
• Contribute and showcase your experiments
Publish
20. Process flow
Each web service offers two methods:
• Request/Response Service (RRS) ► Low latency, highly scalable web service
• Batch Execution Service (BES) ► High volume, asynchronous scoring of
many records
Integrate
21. Hands on
https://studio.azureml.net/
Hands on
• Get the Data
• View it (Descriptive statistics)
• Update missing values
• Remove unnecessary data
• Split data for testing
• Train Model
• Select algorithm
• Score Model
• Compare 2 algorithms
• Two class Logistic regression
• Two class booster
• Publish as WebService
• Test the Service with a sample data
23. Azure Portal
Azure Ops Team
ML Studio
Data Professional
HDInsight
Azure Storage
Desktop Data
Azure Portal &
ML API service
Azure Ops Team
Power BI/DashboardsMobile AppsWeb Apps
ML API service Application Developer
AzureML in a nutshell
24. Azure Portal
Azure Ops Team
ML Studio
Data Scientist
HDInsight
Azure Storage
Desktop Data
Azure Portal &
ML API service
Azure Ops Team
Power BI/DashboardsMobile AppsWeb Apps
ML API service Developer
ML Studio
and the Data Professional
• Access and prepare data
• Create, test and train models
• Collaborate
• One click to stage for
production via the API service
AzurePortal&MLAPIservice
and the Azure Ops Team
• Create ML Studio workspace
• Assign storage account(s)
• Monitor ML consumption
• See alerts when model is ready
• Deploy models to web service
ML API service and the Application Developer
• Tested models available as a URL that can be called from any endpoint
Business users easily access results
from anywhere, on any device
AzureML in a nutshell
26. Machine Learning is a subfield of computer science and statistics that
deals with the construction and study of systems that can learn from
data
Azure Machine Learning key attributes:
Fully managed ► No hardware or software to buy
Integrated ► Drag, drop, connect and configure
Best-in-class algorithms ► Proven solutions from Xbox and Bing
R built in ► Use over 400 R packages, or bring your own R or Python code
Deploy in minutes ► Operationalize with a click
Machine Learning is now approachable to developers
Summary
27. Quick and easy extensibility with cloud functions such as
Power BI, Hadoop (Azure HDInsight) and cloud storage
Summary