Introduction to Machine Learning
Presented By:-
Pranay Rajput
Software Consultant- AI/ML
KnolX Etiquettes
Lack of etiquette and manners is a huge turn-off.
Punctuality
Respect KnolX session timings. Please do not join a session more than 5 minutes after its start time.
Feedback
Make sure to submit constructive feedback for every session, as it is very helpful for the presenter.
Silent Mode
Keep your mobile devices in silent mode. Feel free to step out of the session if you need to attend an urgent call.
Avoid Disturbance
Avoid unwanted chit-chat during the session.
Agenda
• Introduction
• Basics
• Classification
• Regression
• Clustering
• Distance Metrics
• Use-Cases
What is AI?
In computer science, the term artificial intelligence (AI) refers to any human-like intelligence exhibited by a
computer, robot, or other machine. In popular usage, artificial intelligence refers to the ability of a computer or
machine to mimic the capabilities of the human mind—learning from examples and experience, recognizing
objects, understanding and responding to language, making decisions, solving problems—and combining these
and other capabilities to perform functions a human might perform, such as greeting a hotel guest or driving a
car.
What is ML?
A computer program is said to learn from experience (E) with respect to some class of tasks (T) and a performance
measure (P) if its performance at tasks in T, as measured by P, improves with experience E.
Terminology
• Features – The distinct, measurable traits that can be used to describe each item in a quantitative manner.
• Samples – A sample is an item to process (e.g. classify). It can be a document, a picture, a sound, a video, a row in a database or CSV file, or whatever you can describe with a fixed set of quantitative traits.
• Feature vector – An n-dimensional vector of numerical features that represents some object.
• Feature extraction – Transforms data from a high-dimensional space into a space of fewer dimensions.
• Training/Evaluation set – A set of data used to discover potentially predictive relationships.
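To make the terms sample, feature and feature vector concrete, here is a minimal sketch that turns text samples into word-count feature vectors. The vocabulary, the sample sentences and the function name `bag_of_words_vector` are made up for illustration; they are not from the original slides.

```python
from collections import Counter

def bag_of_words_vector(text, vocabulary):
    """Map one sample (a text) to a feature vector of word counts."""
    counts = Counter(text.lower().split())
    # One numerical feature per vocabulary word, always in the same order.
    return [counts[word] for word in vocabulary]

# Hypothetical vocabulary: each word is one feature.
vocabulary = ["goal", "match", "election", "vote"]

# Hypothetical samples: each sentence is one item to process.
samples = [
    "Late goal wins the match",
    "Vote count delays election result",
]

feature_vectors = [bag_of_words_vector(s, vocabulary) for s in samples]
print(feature_vectors)  # [[1, 1, 0, 0], [0, 0, 1, 1]]
```

Each sample ends up as a 4-dimensional feature vector, because the vocabulary defines four features.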
Let’s dig deep into it...
What do you mean by "Apple"?
Workflow
Supervised vs Unsupervised vs Reinforcement
Techniques
• Classification: A type of supervised machine learning. For any given input, a classification algorithm predicts the class of the output variable. Classification can be binary, multi-class, and so on.
• Regression: A type of supervised machine learning that predicts a continuous-valued output. Regression analysis is a statistical model used to predict numeric data instead of labels.
• Clustering: A type of unsupervised machine learning. It groups data points that have similar characteristics into clusters. Ideally, the data points in the same cluster exhibit similar properties, and the points in different clusters are as dissimilar as possible.
Classification
➢ Classify a document into a predefined category.
➢ Documents can be text, images, etc.
➢ A popular choice is the Naive Bayes classifier.
➢ Steps:
– Step 1: Train the program (build a model) using a training set in which each document is labeled with a category, e.g. sports, cricket, news.
– For each word, the classifier computes the probability that it makes a document belong to each of the considered categories.
– Step 2: Test this model against a test data set.
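The train-then-test steps for a Naive Bayes text classifier can be sketched roughly as follows. This is a minimal illustration rather than a production implementation: it assumes a multinomial model with add-one (Laplace) smoothing, and the tiny training set and the function names `train_naive_bayes` and `classify` are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(labeled_docs):
    """Step 1: build a model from (text, category) pairs."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    doc_counts = Counter()              # category -> number of documents
    vocabulary = set()
    for text, category in labeled_docs:
        words = text.lower().split()
        word_counts[category].update(words)
        doc_counts[category] += 1
        vocabulary.update(words)
    return {"word_counts": word_counts,
            "doc_counts": doc_counts,
            "vocabulary": vocabulary}

def classify(model, text):
    """Step 2: pick the category with the highest log-probability."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocabulary"])
    best_category, best_score = None, -math.inf
    for category, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][category]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior of the category
        for word in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing the product.
            score += math.log((counts[word] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_category, best_score = category, score
    return best_category

# Hypothetical training set with two categories.
training_set = [
    ("the team won the cricket match", "sports"),
    ("great goal in the football match", "sports"),
    ("parliament passed the new budget", "news"),
    ("the minister announced an election", "news"),
]
model = train_naive_bayes(training_set)

print(classify(model, "cricket match today"))  # sports
print(classify(model, "election budget"))      # news
```

Working with log-probabilities instead of multiplying raw probabilities avoids numeric underflow on longer documents.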
Regression
● It is a measure of the relation between the mean value of one variable (e.g. output) and corresponding values of other variables (e.g. time and cost).
● Regression analysis is a statistical process for estimating the relationships among variables.
● Regression means predicting a numeric output value using a model fitted to training data.
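As an illustration of predicting a numeric output from training data, the sketch below fits a straight line by ordinary least squares using only the standard library. The data values and the name `fit_line` are invented for the example; real projects would more likely use a library such as scikit-learn.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data that follows cost = 2 * hours + 1 exactly.
hours = [1.0, 2.0, 3.0, 4.0]
cost = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(hours, cost)
predicted = slope * 5.0 + intercept  # prediction for an unseen input
print(slope, intercept, predicted)   # 2.0 1.0 11.0
```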
Clustering
• Clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar to each other than to those in other groups.
• For example, these keywords
– “man’s shoe”
– “women’s shoe”
– “women’s t-shirt”
– “man’s t-shirt”
can be clustered into 2 categories: “shoe” and “t-shirt”, or “man” and “women”.
• Popular methods are K-means clustering and Hierarchical clustering.
Hierarchical clustering
• A method of cluster analysis that seeks to build a hierarchy of clusters.
• There are two strategies:
– Agglomerative:
• This is a "bottom up" approach: each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy.
• Time complexity is O(n^3) for the standard algorithm.
– Divisive:
• This is a "top down" approach: all observations start in one cluster, and splits are performed recursively as one moves down the hierarchy.
• Time complexity is O(2^n) when splits are chosen by exhaustive search.
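The agglomerative ("bottom up") strategy can be sketched as below. This is a deliberately naive version on one-dimensional points. It assumes single linkage (cluster distance = distance between the closest pair of members), which is only one of several possible linkage choices, and the function name and data are hypothetical.

```python
def agglomerative(points, target_clusters):
    """Merge the two closest clusters until target_clusters remain."""
    clusters = [[p] for p in points]  # every observation starts alone
    while len(clusters) > target_clusters:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge j into i
        del clusters[j]                          # j > i, so i stays valid
    return clusters

result = agglomerative([1.0, 1.5, 5.0, 5.2, 9.0], target_clusters=3)
print(result)  # [[1.0, 1.5], [5.0, 5.2], [9.0]]
```

The nested loops over cluster pairs, repeated for every merge, are what make the naive algorithm expensive on large data sets.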
Partitional Clustering
Partitional clustering decomposes a data set into a set of disjoint clusters. Given a data set of N points, a partitioning method constructs K (K ≤ N) partitions of the data, with each partition representing a cluster. That is, it classifies the data into K groups that satisfy the following requirements: (1) each group contains at least one point, and (2) each point belongs to exactly one group.
K-means Clustering
• An example of partitional clustering.
• Partitions n observations into k clusters in which each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster.
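The "nearest mean" idea can be sketched as the usual two-step loop: assign each point to its nearest centroid, then recompute each centroid as the mean of its points. This is a minimal one-dimensional illustration with fixed starting centroids so that the result is reproducible; the data, the starting centroids and the fixed iteration count are assumptions of the example.

```python
def k_means(points, centroids, iterations=10):
    """Repeat assignment and update steps a fixed number of times."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest mean.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid becomes the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = k_means(points, centroids=[1.0, 2.0])
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```

In practice the starting centroids are usually chosen at random, and the loop stops once the assignments no longer change.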
Some distance metrics used in machine learning models
Minkowski Distance
To define the Minkowski distance, we first need a few mathematical terms:
● Vector space: A collection of objects called vectors that can be added together and multiplied by numbers (also called scalars).
● Norm: A function that assigns a strictly positive length to each vector in a vector space (the only exception is the zero vector, whose length is zero). It is usually written as ∥x∥.
● Normed vector space: A vector space over the real or complex numbers on which a norm is defined.
Minkowski distance is a distance metric between two points in a normed vector space. For two points x = (x_1, ..., x_n) and y = (y_1, ..., y_n) it is given by the formula
D(x, y) = (Σ_{i=1..n} |x_i − y_i|^p)^(1/p)
It is a generalized metric that includes the Euclidean and Manhattan distances. By changing the value of p we can calculate the distance in three different ways, also known as the Lp norms:
p = 1, Manhattan Distance
p = 2, Euclidean Distance
p = ∞, Chebyshev Distance
Where it is used: Minkowski distance is frequently used when the variables of interest are measured on ratio scales with an absolute zero value.
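The three special cases of p can be checked with a short sketch. The function name `minkowski` and the sample points are chosen for illustration; p = ∞ is handled as the limit case, which is the largest coordinate difference.

```python
import math

def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length points."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of dimensions")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        # Limit p -> infinity: Chebyshev distance.
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1 / p)

a, b = (0.0, 0.0), (3.0, 4.0)
manhattan = minkowski(a, b, 1)         # |3| + |4| = 7
euclidean = minkowski(a, b, 2)         # sqrt(9 + 16) = 5
chebyshev = minkowski(a, b, math.inf)  # max(3, 4) = 4
print(manhattan, euclidean, chebyshev)
```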
Manhattan Distance
We use Manhattan distance, also known as city block distance or taxicab geometry, when we need to calculate the distance between two data points along a grid-like path. The metric can be understood with the help of a simple example.
In the picture, imagine each cell to be a building and the grid lines to be roads. Suppose I want to travel from Point A to Point B, as marked in the image, following the red or the yellow path. The path is not straight and there are turns. In this case, we use the Manhattan distance metric to calculate the distance walked.
Note: For high-dimensional data, Manhattan distance is often preferred. Also, if you are calculating errors, Manhattan distance is useful when you want to limit the influence of outliers, because it grows only linearly with the size of each difference.
We can get the equation for Manhattan distance by substituting p = 1 in the Minkowski distance formula. The formula is:
d(x, y) = Σ_{i=1..n} |x_i − y_i|
Euclidean Distance
Euclidean distance is the straight-line distance between 2 data points. It is calculated using the Minkowski distance formula by setting the value of p to 2, and is therefore also known as the L2 norm distance metric. The formula is:
d(x, y) = sqrt(Σ_{i=1..n} (x_i − y_i)^2)
Note: Euclidean distance often does not perform well for high-dimensional data. This is due to the "curse of dimensionality".
Hamming Distance:
Hamming distance is a metric for comparing two binary data strings. For two binary strings of equal length, the Hamming distance is the number of bit positions in which the two bits differ.
The Hamming distance between two strings a and b is denoted as d(a, b).
To calculate the Hamming distance between two strings a and b, we perform their XOR operation, (a ⊕ b), and then count the total number of 1s in the resulting string.
Suppose there are two strings, 11011001 and 10011101.
11011001 ⊕ 10011101 = 01000100. Since this contains two 1s, the Hamming distance is d(11011001, 10011101) = 2.
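The XOR-and-count procedure translates directly into a few lines of Python. This sketch assumes the inputs are strings of the characters 0 and 1; the function name `hamming_distance` is chosen for illustration.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two binary strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    # XOR sets a 1 exactly where the two inputs disagree.
    xor = int(a, 2) ^ int(b, 2)
    return bin(xor).count("1")

xor_bits = format(int("11011001", 2) ^ int("10011101", 2), "08b")
distance = hamming_distance("11011001", "10011101")
print(xor_bits, distance)  # 01000100 2
```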
Cosine Distance & Cosine Similarity:
Cosine distance and cosine similarity are mainly used to measure how similar two data points are. As the cosine distance between the data points increases, the cosine similarity decreases, and vice versa. Thus, points separated by a smaller angle are more similar than points separated by a larger one. Cosine similarity is given by cos θ, where θ is the angle between the two vectors, and cosine distance is 1 - cos θ. Example:
In the example image, two data points are shown in blue, the angle between their vectors is 90 degrees, and cos 90° = 0. Therefore the two points are not similar, and their cosine distance is 1 - cos 90° = 1.
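The 90-degree example can be reproduced numerically. The sketch below computes cos θ from the dot product and the vector lengths; the vectors and function names are illustrative, and the code assumes neither vector is the zero vector.

```python
import math

def cosine_similarity(x, y):
    """cos(theta) between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def cosine_distance(x, y):
    return 1.0 - cosine_similarity(x, y)

# Perpendicular vectors: angle 90 degrees, similarity 0, distance 1.
perpendicular = cosine_distance((1.0, 0.0), (0.0, 2.0))

# Vectors pointing the same way: distance close to 0, whatever their lengths.
parallel = cosine_distance((1.0, 2.0), (2.0, 4.0))

print(perpendicular, round(parallel, 9))
```

Because only the angle matters, scaling a vector does not change its cosine distance to another vector.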
Machine learning: when?
➢ Learning is useful when:
- Human expertise does not exist (navigating on Mars)
- Humans are unable to explain their expertise (speech recognition)
- The solution changes over time (routing on a computer network)
- The solution needs to be adapted to particular cases (user biometrics)
Example: It is easier to write a program that learns to play checkers or backgammon well by self-play than to convert the expertise of a master player into a program.
Use-Cases
● Machine Translation (Language Translation)
● Image Search (Similarity)
● Recommendation Systems: Amazon Prime, Netflix
● Classification: Google News, Spam Email Detection
● Text Summarization: Google News
● Rating a Review/Comment: Yelp
● Fraud Detection: Credit Card Providers
● Decision Making: e.g. Bank/Insurance sector
● Sentiment Analysis: Crime Detection
● Speech Recognition: Alexa, Siri, Cortana, Google Home
● Face Detection: Facebook’s photo tagging
Popular Frameworks/Tools
● Weka
● Carrot2
● GATE
● OpenNLP
● LingPipe
● Stanford NLP
● Mallet – Topic Modelling
● Gensim – Topic Modelling (Python)
● Apache Mahout
● MLlib – Apache Spark
● scikit-learn – Python
● LIBSVM – Support Vector Machines
and many more...
Thank You !

A Glance At The Java Performance ToolboxAna-Maria Mihalceanu
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentMahmoud Rabie
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Jeffrey Haguewood
 

Último (20)

React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorial
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance Toolbox
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career Development
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
 

Introduction to machine learning

  • 1. Introduction to Machine Learning. Presented by: Pranay Rajput, Software Consultant, AI/ML
  • 2. KnolX Etiquettes: Lack of etiquette and manners is a huge turn off. Punctuality: Respect KnolX session timings; you are requested not to join sessions more than 5 minutes after the session start time. Feedback: Make sure to submit constructive feedback for all sessions, as it is very helpful for the presenter. Silent Mode: Keep your mobile devices in silent mode, and feel free to step out of the session if you need to attend an urgent call. Avoid Disturbance: Avoid unwanted chit-chat during the session.
  • 3. Agenda • Introduction • Basics • Classification • Regression • Clustering • Distance Metrics • Use-Cases
  • 4. What is AI? In computer science, the term artificial intelligence (AI) refers to any human-like intelligence exhibited by a computer, robot, or other machine. In popular usage, artificial intelligence refers to the ability of a computer or machine to mimic the capabilities of the human mind—learning from examples and experience, recognizing objects, understanding and responding to language, making decisions, solving problems—and combining these and other capabilities to perform functions a human might perform, such as greeting a hotel guest or driving a car. What is ML? A computer program is said to learn from experience (E) with some class of tasks (T) and a performance measure (P) if its performance at tasks in T as measured by P improves with E.
  • 5.
  • 6. Terminology • Features – The distinct traits that can be used to describe each item in a quantitative manner. • Samples – A sample is an item to process (e.g. classify). It can be a document, a picture, a sound, a video, a row in a database or CSV file, or anything else you can describe with a fixed set of quantitative traits. • Feature vector – An n-dimensional vector of numerical features that represents some object. • Feature extraction – Transforms data from a high-dimensional space into a space of fewer dimensions. • Training/Evaluation set – A set of data used to discover potentially predictive relationships.
  • 7. Let’s dig deep into it... What do you mean by Apple?
  • 8.
  • 10.
  • 11. Supervised vs Unsupervised vs Reinforcement
  • 12. Techniques • Classification: Classification is a type of supervised machine learning. For any given input, a classification algorithm predicts the class of the output variable. There are several types of classification, such as binary classification and multi-class classification. • Regression: Regression is a type of supervised machine learning. It predicts a continuous-valued output. Regression analysis is a statistical approach used to predict numeric values rather than labels. • Clustering: Clustering is a type of unsupervised machine learning. It groups data points with similar characteristics into clusters. Ideally, the data points in the same cluster should exhibit similar properties, and the points in different clusters should be as dissimilar as possible.
  • 13. Classification ➢ Classify a document into a predefined category. ➢ Documents can be text or images. ➢ A popular choice is the Naive Bayes classifier. ➢ Steps: – Step 1: Train the program (build a model) using a training set in which each document is labeled with a category, e.g. sports, cricket, news. – For each word, the classifier computes the probability that it makes a document belong to each of the considered categories. – Step 2: Test the model against a test data set.
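The two steps above can be sketched with a small word-count Naive Bayes classifier. This is a minimal illustration, not the presenter's implementation: the function names `train` and `classify`, the toy training set, and the use of add-one (Laplace) smoothing are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Step 1: build a model from (text, category) pairs."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    doc_counts = Counter()              # category -> number of documents
    vocab = set()
    for text, category in docs:
        words = text.lower().split()
        word_counts[category].update(words)
        doc_counts[category] += 1
        vocab.update(words)
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Step 2: pick the category with the highest log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        # Log prior of the category.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in text.lower().split():
            count = word_counts[category][word]
            # Add-one smoothing (an assumption) avoids log(0) for unseen words.
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical training set with two categories.
training_set = [
    ("the batsman scored a century", "cricket"),
    ("the bowler took five wickets", "cricket"),
    ("the government passed a new budget", "news"),
    ("the election results were announced", "news"),
]
model = train(training_set)
predicted = classify("the bowler scored", model)
```

Here `predicted` comes out as the cricket category, because "bowler" and "scored" were only seen in cricket documents.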
  • 14. Regression ● It is a measure of the relation between the mean value of one variable (e.g. output) and the corresponding values of other variables (e.g. time and cost). ● Regression analysis is a statistical process for estimating the relationships among variables. ● Regression predicts a numeric output value from a model fitted to training data.
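As a rough illustration of regression, the sketch below fits a straight line to a few training points with ordinary least squares and then predicts a new value. The function name `fit_line` and the sample data are assumptions chosen for the example; least squares is only one of several ways to fit a regression model.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data that lies exactly on cost = 2 * hours + 1.
hours = [1.0, 2.0, 3.0, 4.0]
cost = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(hours, cost)
predicted = slope * 5.0 + intercept  # predict the cost for 5 hours
```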
  • 15. Clustering • Clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar to each other than to objects in other groups. • For example, the keywords – “man’s shoe” – “women’s shoe” – “women’s t-shirt” – “man’s t-shirt” – can be clustered into 2 categories, either “shoe” and “t-shirt” or “man” and “women”. • Popular methods are K-means clustering and hierarchical clustering.
  • 16. Hierarchical clustering • A method of cluster analysis which seeks to build a hierarchy of clusters. • There are two strategies: – Agglomerative: • This is a "bottom up" approach: each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy. • Time complexity is O(n^3) for the standard algorithm. – Divisive: • This is a "top down" approach: all observations start in one cluster, and splits are performed recursively as one moves down the hierarchy. • Time complexity is O(2^n) when splits are found by exhaustive search.
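The agglomerative ("bottom up") strategy can be sketched as follows. This is a naive illustration on one-dimensional points; the function name `agglomerative`, the single-linkage distance (closest pair of members), and the sample data are assumptions, not details taken from the slides.

```python
def agglomerative(points, num_clusters):
    """Merge the two closest clusters until only num_clusters remain."""
    # Each observation starts in its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > num_clusters:
        best = None  # (distance, index_i, index_j) of the closest pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage (an assumption): distance between closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


result = agglomerative([1.0, 1.5, 5.0, 5.2, 9.0], 3)
```

With this data the result is three clusters: the two points near 1, the two points near 5, and the single point 9.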
  • 17. Partitional Clustering: Partitional clustering decomposes a data set into a set of disjoint clusters. Given a data set of N points, a partitioning method constructs K (N ≥ K) partitions of the data, with each partition representing a cluster. That is, it classifies the data into K groups by satisfying the following requirements: (1) each group contains at least one point, and (2) each point belongs to exactly one group.
  • 18. K-means Clustering • An example of partitional clustering. • Partitions n observations into k clusters in which each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster.
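A minimal sketch of K-means, assuming two-dimensional points and a fixed number of iterations. Seeding the centroids with the first k points is a simplification made here so the example is deterministic; practical implementations usually pick the initial centroids randomly. The name `kmeans` and the data are hypothetical.

```python
def kmeans(points, k, iterations=10):
    """Return (centroids, groups) after a fixed number of K-means iterations."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [points[i] for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest mean.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            groups[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids, groups


data = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0), (2.0, 1.0), (8.0, 9.0)]
centroids, groups = kmeans(data, 2)
```

On this data the points near (1, 1) end up in one cluster and the points near (8, 8) in the other.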
  • 19.
  • 20. Some distance metrics used in machine learning models
  • 21. Minkowski Distance: To define Minkowski distance, we need a few mathematical terms: ● Vector space: A collection of objects called vectors that can be added together and multiplied by numbers (also called scalars). ● Norm: A function that assigns a strictly positive length to each vector in a vector space (the only exception is the zero vector, whose length is zero). It is usually written as ∥x∥. ● Normed vector space: A vector space over the real or complex numbers on which a norm is defined. Minkowski distance is defined as the distance metric between two points in a normed vector space. For two points x and y with n coordinates, it is given by the formula $d(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}$.
  • 22. It is also a generalized metric that includes Euclidean and Manhattan distance. By changing the value of p, we can calculate the distance in three different ways, also known as Lp norms: p = 1, Manhattan distance; p = 2, Euclidean distance; p = ∞, Chebyshev distance. Where it is used: Minkowski distance is frequently used when the variables of interest are measured on ratio scales with an absolute zero value.
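The three special cases can be checked with one short function. This is an illustrative sketch; the name `minkowski` and the sample points are assumptions. The p = ∞ case is handled separately as the largest coordinate difference, which is the limit of the general formula.

```python
import math


def minkowski(x, y, p):
    """Minkowski distance between two equal-length points for a given p."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        # Limit as p grows: only the largest difference matters (Chebyshev).
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1 / p)


a, b = (0.0, 0.0), (3.0, 4.0)
manhattan = minkowski(a, b, 1)         # |3| + |4|
euclidean = minkowski(a, b, 2)         # sqrt(3^2 + 4^2)
chebyshev = minkowski(a, b, math.inf)  # max(|3|, |4|)
```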
  • 23. Manhattan Distance: We use Manhattan distance, also known as city block distance or taxicab geometry, when we need to calculate the distance between two data points along a grid-like path. The metric can be understood with the help of a simple example. In the above picture, imagine each cell to be a building and the grid lines to be roads. Suppose I want to travel from Point A to Point B marked in the image and follow the red or the yellow path. The path is not straight and there are turns. In this case, we use the Manhattan distance metric to calculate the distance walked.
  • 24. Note: For high-dimensional data, Manhattan distance is preferred. Also, when calculating errors, Manhattan distance is useful when you do not want outliers to dominate, because each difference contributes linearly rather than quadratically. We can get the equation for Manhattan distance by substituting p = 1 in the Minkowski distance formula. The formula is $d(x, y) = \sum_{i=1}^{n} |x_i - y_i|$.
  • 25. Euclidean Distance: Euclidean distance is the straight-line distance between two data points. It is calculated using the Minkowski distance formula by setting p to 2, and is therefore also known as the L2 norm distance metric. The formula is $d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$. Note: Euclidean distance does not perform well for high-dimensional data. This is due to the ‘curse of dimensionality’.
  • 26. Hamming Distance: Hamming distance is a metric for comparing two binary data strings. When comparing two binary strings of equal length, the Hamming distance is the number of bit positions in which the two bits differ. The Hamming distance between two strings a and b is denoted d(a, b). To calculate the Hamming distance between two strings a and b, we perform their XOR operation (a ⊕ b) and then count the total number of 1s in the resulting string. Suppose there are two strings, 11011001 and 10011101. 11011001 ⊕ 10011101 = 01000100. Since this contains two 1s, the Hamming distance is d(11011001, 10011101) = 2.
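The XOR-and-count procedure from the slide can be written in a few lines. The function name `hamming` is an assumption for illustration; it parses the binary strings as integers, XORs them, and counts the 1 bits.

```python
def hamming(a, b):
    """Hamming distance between two equal-length binary strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    # XOR sets a bit to 1 exactly where the two inputs differ.
    xor = int(a, 2) ^ int(b, 2)
    return bin(xor).count("1")


# The example from the slide: 11011001 XOR 10011101 = 01000100.
distance = hamming("11011001", "10011101")
```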
  • 27. Cosine Distance & Cosine Similarity: The cosine distance and cosine similarity metrics are mainly used to find similarities between two data points. As the cosine distance between two data points increases, the cosine similarity decreases, and vice versa. Thus, points closer to each other are more similar than points that are far away from each other. Cosine similarity is given by cos θ, and cosine distance is 1 - cos θ. Example: In the above image, there are two data points shown in blue. The angle between these points is 90 degrees, and cos 90° = 0. Therefore, the two points are not similar, and their cosine distance is 1 - cos 90° = 1.
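The 90-degree example can be reproduced numerically. In this sketch, cos θ is computed as the dot product divided by the product of the vector lengths; the function names and sample vectors are assumptions for illustration, and zero-length vectors are not handled.

```python
import math


def cosine_similarity(x, y):
    """cos(theta) between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def cosine_distance(x, y):
    """Cosine distance, defined as 1 - cos(theta)."""
    return 1.0 - cosine_similarity(x, y)


# Two perpendicular vectors: similarity 0, distance 1.
perpendicular = cosine_distance((1.0, 0.0), (0.0, 1.0))
# Two vectors pointing the same way: similarity 1, distance 0.
parallel = cosine_distance((1.0, 2.0), (2.0, 4.0))
```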
  • 28. Machine learning: when? ➢ Learning is useful when: - Human expertise does not exist (navigating on Mars) - Humans are unable to explain their expertise (speech recognition) - The solution changes over time (routing on a computer network) - The solution needs to be adapted to particular cases (user biometrics) Example: It is easier to write a program that learns to play checkers or backgammon well by self-play than to convert the expertise of a master player into a program.
  • 29. Use-Cases ● Machine Translation (Language Translation) ● Image Search (Similarity) ● Recommendation Systems: Amazon Prime, Netflix ● Classification: Google News, Spam Email Detection ● Text Summarization: Google News ● Rating a Review/Comment: Yelp ● Fraud Detection: Credit Card Providers ● Decision Making: e.g. Bank/Insurance sector ● Sentiment Analysis: Crime Detection ● Speech Recognition: Alexa, Siri, Cortana, Google Home ● Face Detection: Facebook’s Photo tagging
  • 30. Popular Frameworks/Tools ● Weka ● Carrot2 ● Gate ● OpenNLP ● LingPipe ● Stanford NLP ● Mallet: Topic Modelling ● Gensim: Topic Modelling (Python) ● Apache Mahout ● MLlib: Apache Spark ● scikit-learn: Python ● LIBSVM: Support Vector Machines and many more...