IAC 2024 - IA Fast Track to Search Focused AI Solutions
Machine Learning AND Deep Learning for OpenPOWER
1. OpenPOWER Webinar Series
Machine Learning and Deep Learning 101
Clarisse Taaffe Hedglin
clarisse@us.ibm.com
Executive AI Architect
IBM Systems
2.
Please note IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice
and at IBM’s sole discretion.
Information regarding potential future products is intended to outline our general product direction and it should
not be relied on in making a purchasing decision.
The information mentioned regarding potential future products is not a commitment, promise, or legal
obligation to deliver any material, code or functionality. Information about potential future products may not be
incorporated into any contract.
The development, release, and timing of any future features or functionality described for our products remains
at our sole discretion.
Performance is based on measurements and projections using standard IBM benchmarks in a controlled
environment. The actual throughput or performance that any user will experience will vary depending upon
many factors, including considerations such as the amount of multiprogramming in the user’s job stream,
the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be
given that an individual user will achieve results similar to those stated here.
3. Session Objectives
Introducing Machine Learning and Deep Learning (ML/DL) in the context of AI and analytics
Understanding the iterative nature of the workflow
Getting an overview of different ML/DL algorithms
Interpreting models and outcomes
4. How Customers Do Data Analytics Traditionally
Spreadsheets
• No governance
• No collaboration
• Limited complexity
Business Rules
• Broad rules and categories
• Not dynamic
Homegrown Applications
• Hard to maintain
• Pre-set rules and approaches
Other Applications
• Limited use of analytics
• Hard coded models that do not apply to unique needs
• Slow response
6. Machine Learning Definition
“…an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed.”
7. From Data to Actions
Descriptive: What has happened?
Predictive: What will happen?
Prescriptive: What should we do?
Cognitive: Learn dynamically
Data → Action, with human inputs
8. Machine Learning Flow
Credit card transaction → Fraudulent vs. legitimate
Loan application → Approve vs. reject
MRI image → Tumor benign vs. malignant
House data → House appraisal value
Input → Mathematical Function → Prediction
Not a memorization caching system
Representing pattern by a mathematical function
Machine learning is just a bunch of math
9. Data – Estimate House Price
Every column except last is a feature
Last column is a label
This is a labeled data set
Sq Ft Bedroom Bathroom Price
2000 3 2 $350,000
1500 2 2 $280,000
2200 3 3 $400,000
… … … …
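A minimal sketch of how the labeled data set above splits into features and a label, followed by a simple fit. Using only square footage and an ordinary least squares line is an assumption made for illustration; the slide does not say which model is used, and `fit_line` and `predict_price` are hypothetical helper names.

```python
# Rows from the slide's table: (sq_ft, bedrooms, bathrooms, price)
rows = [
    (2000, 3, 2, 350_000),
    (1500, 2, 2, 280_000),
    (2200, 3, 3, 400_000),
]

# Every column except the last is a feature; the last column is the label.
features = [row[:-1] for row in rows]
labels = [row[-1] for row in rows]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative simplification: use only square footage as the input.
sq_ft = [f[0] for f in features]
slope, intercept = fit_line(sq_ft, labels)


def predict_price(square_feet):
    """Apply the learned function f(x) to a new house."""
    return slope * square_feet + intercept


print(round(predict_price(1800)))
```

With only three rows this is a toy; the point is that the model is a function derived from the labeled rows, not a lookup of them.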
10. Machine Learning Categories
With labeled data, it is called supervised machine learning.
What if we don’t have labeled data?
– It’s unsupervised learning
– Objective is forming clusters based on data
Example unlabeled columns: Customer Revenue, Customer Profit, # of Online Purchases, # of Store Purchases, …
11. Data is Prerequisite to AI
Data sources:
Transaction and application data
Machine, sensor data
Enterprise content
Image, geospatial, video
Social data
Third-party data
Data platform:
Unstructured, Landing, Exploration and Archive
Operational Data
Real-time Data Processing & Analytics
Information Integration & Governance
Use cases:
Risk, Fraud
Chat bots, personal assistants
Supply Chain Optimization
Dynamic Pricing, Recommenders
Behavior Modeling
Vision, Autonomous Systems
12. Iterative Process of Machine Learning
Define Problem
Prepare Data
Train Model
Evaluate Model
Fine-Tune Model
Deploy Model
13. Business Understanding
Prior to performing machine learning, identify foundational knowledge of the problem you are trying to solve.
Articulate the use case and identify the problem
What type of supporting data is available?
How large is the data?
Identify response time/throughput characteristics
14. Common Patterns of Analytics Business Problems
Predict a Future Event
Segment Data / Detect Anomalies
Determine optimal quantity, price, resource allocation, or best action
Understand Past Activity
Discover Insights in Content (text, images, video)
Interact in Natural Language
Forecast and Budget based on past activity
Supervised, Unsupervised
Predictive: What will happen?
Prescriptive: What should we do?
Descriptive: What happened?
Planning: What is our plan?
NLP, Deep Learning, Supervised
Solving business problems with Data and AI will utilize a combination of these analytics patterns
15. Learning to Map Input to Output
Input | Output | Application
Customer behaviour | Responder (1/0) | Target marketing
Banking transactions | Fraud (1/0) | Fraud Detection
Call Data Record | Churn (1/0) | Customer retention
Image | Object/Caption (1,…1000) | Object Detection
Audio | Text Transcript | Speech recognition
Arabic | English | Machine translation
16. Prepare Data - Visualization
Visualization can give us an intuition about the relationships between the data
Can be used to find clusters and patterns in data
Can be used to understand distributions in data (Univariate statistics)
Correlations and Bivariate statistics
Many graphical packages available
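Before plotting anything, the univariate and bivariate statistics mentioned above can be computed directly. A small sketch using only the Python standard library; the two columns reuse the house rows from slide 9, and `pearson` is a hypothetical helper written out by hand for clarity.

```python
from statistics import mean, stdev

# Columns reuse the three house rows shown earlier in the deck.
sq_ft = [2000, 1500, 2200]
price = [350_000, 280_000, 400_000]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length columns."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Univariate statistics: one column at a time.
print("mean sq ft:", mean(sq_ft), "stdev:", round(stdev(sq_ft), 1))
# Bivariate statistics: how two columns move together.
print("corr(sq ft, price):", round(pearson(sq_ft, price), 3))
```

A correlation close to 1 here suggests square footage is a useful feature for price, which is the kind of intuition a scatter plot would also give.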
17. Prepare Data
What is the best representation of sample data for more accurate prediction?
Feature engineering
Feature Selection
What to do with missing values
How to handle non-numeric types
Columns: Feature A, Feature B, Feature C, Derived Feature A, Derived Feature B, Target / Label
Row 1: 0, Sentence 1, f1(A,B), f2(A,B), category1
Row 2: NaN, Sentence 2, category2
More derived features (binary encoded)
It has been stated that 80% of a data scientist’s time is spent in data preparation … here’s why (even after data is identified/obtained)
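Two of the preparation questions above, missing values and non-numeric types, can be sketched in a few lines. Mean imputation and one-hot (binary) encoding are common choices assumed here for illustration; the slide does not prescribe a method, and the rows and column names are made up.

```python
from statistics import mean

# Hypothetical raw rows: a numeric feature with a gap and a category column.
rows = [
    {"feature_a": 0.0, "feature_c": "category1"},
    {"feature_a": None, "feature_c": "category2"},
    {"feature_a": 4.0, "feature_c": "category1"},
]

# Missing values: replace None with the mean of the observed values.
observed = [r["feature_a"] for r in rows if r["feature_a"] is not None]
fill_value = mean(observed)
for r in rows:
    if r["feature_a"] is None:
        r["feature_a"] = fill_value

# Non-numeric types: binary (one-hot) encode the category column.
categories = sorted({r["feature_c"] for r in rows})


def one_hot(value):
    """Return one 0/1 column per known category."""
    return [1 if value == c else 0 for c in categories]


encoded = [[r["feature_a"]] + one_hot(r["feature_c"]) for r in rows]
print(encoded)
```

Each category becomes its own derived 0/1 feature, which is why the feature count grows quickly during preparation.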
18. Image Data Representation
An input image of size 256 x 256 contains 65,536 pixels
Each pixel is a feature in the feature vector
Highly computationally intensive due to large number of features
x1: x1,1 x1,2 … x1,65536
x2: x2,1 x2,2 … x2,65536
x3: x3,1 x3,2 … x3,65536
…
xn: xn,1 xn,2 … xn,65536
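To make the pixel-to-feature mapping concrete, here is a small sketch that flattens one 256 x 256 image into a single feature vector. The nested-list image and the row-major ordering are assumptions for illustration; real pipelines typically use an array library.

```python
# A 256 x 256 grayscale image as a nested list of pixel intensities (all zeros here).
HEIGHT, WIDTH = 256, 256
image = [[0 for _ in range(WIDTH)] for _ in range(HEIGHT)]
image[1][2] = 255  # one bright pixel so the mapping can be checked

# Row-major flattening: pixel (row, col) becomes feature index row * WIDTH + col.
feature_vector = [pixel for row in image for pixel in row]

print(len(feature_vector))  # 65536 features for one image
```

A data set of n such images is then an n x 65,536 matrix, one row per image.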
19. Train / Test Data Split
Break input data (all data) by random sampling into Training, Cross Validation, and Test sets
1. Training data is used to build models
2. Cross validation set is used to evaluate similar versions of models
3. Test set is used to run inference and evaluate the quality of the model
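The three-way split can be sketched as a shuffle followed by slicing. The 60/20/20 proportions, the fixed seed, and the function name `split_data` are assumptions for illustration; the slide does not give split sizes.

```python
import random


def split_data(samples, train_pct=60, cv_pct=20, seed=0):
    """Shuffle and split into training, cross validation, and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # random sample of all data
    n_train = len(shuffled) * train_pct // 100
    n_cv = len(shuffled) * cv_pct // 100
    train = shuffled[:n_train]
    cv = shuffled[n_train:n_train + n_cv]
    test = shuffled[n_train + n_cv:]  # whatever remains is held out
    return train, cv, test


train, cv, test = split_data(range(100))
print(len(train), len(cv), len(test))
```

Keeping the test set untouched until the end is what makes its score a fair estimate of model quality.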
20. Model Objective
Use training data to derive f(x) so that the cost is minimized:
Cost(Actual - f(x))
Minimize (Actual - Prediction)
Every computational iteration analyzes the entire training data set
The process of minimizing this cost function is called training
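One common way to minimize such a cost is batch gradient descent, sketched below. The slide does not name an optimizer, so gradient descent, the mean squared error cost, the learning rate, and the toy data are all assumptions chosen for illustration.

```python
# Toy training set generated from y = 2x + 1 (made-up numbers for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
actual = [3.0, 5.0, 7.0, 9.0]


def cost(w, b):
    """Mean squared error between actual values and f(x) = w * x + b."""
    return sum((a - (w * x + b)) ** 2 for x, a in zip(xs, actual)) / len(xs)


w, b = 0.0, 0.0
learning_rate = 0.05  # assumed step size, small enough for this data
for _ in range(2000):
    # Each iteration uses the entire training set to compute the gradient.
    errors = [(w * x + b) - a for x, a in zip(xs, actual)]
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = 2 * sum(errors) / len(xs)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(round(w, 3), round(b, 3), cost(w, b))
```

After training, w and b should be close to the 2 and 1 used to generate the data, and the cost close to zero.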
21. Select Machine Learning Algorithms
Supervised Learning
Logistic Regression
Decision trees
Random forests
Neural Network (Deep Learning)
Bayesian Techniques
Support Vector Machines
Ensemble Methods
Markov Logic Network
Unsupervised Learning
K-Means
Hierarchical Clustering
Anomaly Detection
Density-based methods
Principal Component Analysis
22. Logistic Regression
Classification system
Medical image - tumor benign or
malignant
Credit card transaction - normal or
fraudulent
Customer churning - yes or no
Email - normal or spam
Language identification - English vs. French vs. Spanish
(Figure: f(x) plotted over features x1 and x2)
Map input to one of the output categories
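A minimal two-class sketch of the idea: a sigmoid turns a weighted sum of x1 and x2 into a probability, and a 0.5 threshold maps it to a category. The data points, learning rate, iteration count, and plain gradient descent training loop are assumptions for illustration, not the method of any particular library.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Made-up, linearly separable points: (x1, x2) -> class 0 or 1.
points = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (4.0, 5.0), (5.0, 4.0), (5.0, 5.0)]
labels = [0, 0, 0, 1, 1, 1]

w1, w2, b = 0.0, 0.0, 0.0
learning_rate = 0.1  # assumed value
n = len(points)
for _ in range(2000):
    g1 = g2 = gb = 0.0
    for (x1, x2), y in zip(points, labels):
        # Predicted probability minus label is the log-loss gradient term.
        error = sigmoid(w1 * x1 + w2 * x2 + b) - y
        g1 += error * x1
        g2 += error * x2
        gb += error
    w1 -= learning_rate * g1 / n
    w2 -= learning_rate * g2 / n
    b -= learning_rate * gb / n


def classify(x1, x2):
    """Map an input to one of two output categories."""
    return 1 if sigmoid(w1 * x1 + w2 * x2 + b) >= 0.5 else 0


print([classify(x1, x2) for x1, x2 in points])
```

Problems with more than two categories, such as the language identification example, need a multiclass extension that this sketch does not cover.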
24. K-Means
Randomly choose 3 data points as centroids
Assign each data point to one of the groups based on distance from the centroids
Recompute the centroid of each group
Reassign each data point
Repeat until convergence
Identifies patterns and clusters
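The steps above can be sketched in plain Python. The 2-D points, the fixed seed, the iteration cap, and the choice to keep an old centroid when a group becomes empty are assumptions for illustration.

```python
import random


def kmeans(points, k=3, seed=0, max_iters=100):
    """Plain k-means on 2-D points; returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Randomly choose k data points as the initial centroids.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iters):
        # Assign each point to the group with the nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: (p[0] - centroids[c][0]) ** 2
                                        + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed group
        assignments = new_assignments
        # Recompute each centroid as the mean of its group.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a group is empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, assignments


# Made-up points forming three loose groups.
data = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8), (1, 8), (2, 9), (1, 9)]
centroids, assignments = kmeans(data)
print(centroids, assignments)
```

Because the starting centroids are random, different seeds can converge to different groupings, which is why k-means is often run several times.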
28. TensorFlow – a quick review
Deep learning (DL) framework invented and open sourced by Google
Based on the notion of tensors, which are multi-dimensional arrays of numbers
Implements a number of functions that are common to all deep learning workflows (optimizers, backpropagation, etc.)
Programming model: the user defines a neural network as a graph, then “feeds” data to the network to either train or perform inference
Most widely adopted platform in the DL universe as of today
One of the best documented frameworks
29. PyTorch – a quick review
Facebook’s framework for research
• Cousin of the Lua-based Torch framework, but rewritten to be tailored to a Python frontend
• Gaining popularity quickly for its ease of use in R&D
• Supports dynamic computation graphs
• Based on Python with NumPy compatibility
• Multi-GPU
• Easy to use, and supports standard debug tools
30. Transfer Learning
Applying learning acquired in one domain to a problem of a similar domain
Useful because you can use pretrained networks that might have taken weeks to train
Useful because early layers are trained to distinguish coarser features
Typically final layer is removed for a new problem and network is retrained using new data
31. Definitions: Scratch, Finetune, Feature Extract
• Scratch: training all parameters yourself, starting from random initialization
• Finetune: training many or most parameters yourself but retaining some prior weights
• Feature Extract = Pre-Trained: only train the very last classification layers
37. Model Goal – Does it Help my Business?
(Figure: cumulative gains chart. X-axis: Population in % (ranked in descending order of scores). Y-axis: Targets in %. Curves: Optimal, Random, Model.)
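A chart of this kind can be computed by ranking the population by model score and counting how many targets fall in each top slice. The sketch below assumes ten equal slices; the scores, outcomes, and the function name `cumulative_gains` are made up for illustration.

```python
def cumulative_gains(scores, targets, steps=10):
    """Percent of all targets captured in the top 10%, 20%, ... of the ranked population."""
    # Rank the population in descending order of model score.
    ranked = [t for _, t in sorted(zip(scores, targets),
                                   key=lambda st: st[0], reverse=True)]
    total = sum(ranked)
    gains = []
    for i in range(1, steps + 1):
        cutoff = len(ranked) * i // steps
        gains.append(100.0 * sum(ranked[:cutoff]) / total)
    return gains


# Made-up model scores and true outcomes (1 = target) for ten customers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
targets = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

model_curve = cumulative_gains(scores, targets)
random_curve = [10.0 * i for i in range(1, 11)]  # baseline: no ranking skill
print(model_curve)
```

The further the model curve sits above the random baseline, the fewer customers the business needs to contact to reach a given share of targets.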
38. Model Output Should be Easy to Consume
In a Call Center
In a Mobile App
Deployment is the idea of making the insights available to application
developers, consumers, and business users
41. AI demands purpose-built infrastructure throughout the journey
• Data preparation
• Model development environment
• Runtime environment
• Train, deploy and manage models
• Business KPI, production metrics
• Explainability and fairness
43. In summary
Machine Learning is computation
Understand the problem to solve
Supervised vs. unsupervised learning
Feature engineering - know your data
Iterative methodology
Greater accuracy with Deep Learning
Value gained in deployments
Infrastructure matters
44. IBM Systems WW Client Experience Centers: AI Center of Competency
Co-Creation Lab
Work side-by-side with IBM data scientists, cloud and infrastructure experts
Build Custom AI Solution
Plan for a pre-built Solution
Design a scalable, industrial-strength training & inferencing enterprise platform
Fast Start / Design Workshops
Discovery Workshop
Overview Technologies
Understand Business Challenges
Prioritize Use Cases
Showcases / Demos
AI Immersion Experience
AI Consulting
Collateral / Assets
Field Support
Workshops & Co-Creation Labs
Deploy, Adopt, Learn
IBM Systems ML/DL Hands-On and Customizable
Combination of Lectures and Labs
IBM or Customer Location
IBM Portfolio and Open Source Tools
Contact us at aicoc@us.ibm.com
Learn it, Design it, Build it Together, Use it