This document discusses Naive Bayes classification. It begins by introducing classification and defining Naive Bayes as a simple probabilistic classifier based on applying Bayes' theorem with strong (naive) independence assumptions. It then provides a worked example of using Naive Bayes for classification, showing the learning and testing phases. It concludes that Naive Bayes is an intuitive and fast classification method that is widely used, particularly in fields such as natural language processing.
3. Introduction
Classification:
In machine learning and statistics, classification is the problem of
identifying to which of a set of categories a new observation belongs.
The individual observations are analyzed into a set of quantifiable
properties, known variously as explanatory variables, features, etc.
These properties may be categorical (e.g. "A", "B", "AB", or "O" for
blood type), ordinal (e.g. "large", "medium", or "small"),
integer-valued (e.g. a word count), or real-valued (e.g. a measured height).
4. Naïve-Bayes Classifier
An algorithm that implements classification, especially in a concrete
implementation, is known as a classifier.
A Naïve-Bayes classifier is a simple probabilistic classifier based on
applying Bayes' theorem with strong (naive) independence assumptions.
It is named after Thomas Bayes (1702-1761), who proposed Bayes' Theorem.
In simple terms, a Naïve-Bayes classifier assumes that the presence (or
absence) of a particular feature of a class is unrelated to the presence (or
absence) of any other feature, given the class variable.
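To make this assumption concrete, the minimal Python sketch below (not from the original slides; the toy spam/ham likelihood numbers are hypothetical) shows how the class-conditional probability of a feature vector factorizes into a product of per-feature probabilities.

```python
from math import prod

# Hypothetical likelihood tables P(feature | class) for a toy spam filter;
# all numbers are illustrative, not taken from the slides.
likelihood = {
    "spam": {"contains:offer": 0.8, "contains:meeting": 0.1},
    "ham":  {"contains:offer": 0.2, "contains:meeting": 0.6},
}

def class_conditional(features, cls):
    # Naive independence: P(x1, ..., xn | C) = P(x1|C) * ... * P(xn|C)
    return prod(likelihood[cls][f] for f in features)

print(class_conditional(["contains:offer", "contains:meeting"], "spam"))  # 0.08
print(class_conditional(["contains:offer", "contains:meeting"], "ham"))   # 0.12
```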
5. Explanation: Naïve-Bayes
Let,
X : Data sample whose class label is unknown.
H : Some hypothesis, such that X belongs to some class C.
P(H|X) : Probability that the hypothesis holds given the observed data
sample X.
P(H|X) is the posterior probability of H conditioned on X.
In simple words, suppose the data samples are fruits described by their
color and shape.
Suppose that:
X : red and round.
H : the hypothesis that X is an apple.
P(H|X) reflects our confidence that X is an apple, having seen that X is
red and round.
6. Explanation: Naïve-Bayes
P(H) is the prior probability of H.
For the data sample, this is the probability that it is an apple,
regardless of how the data looks.
P(X|H) is the probability of observing X given that H holds; in Bayes'
rule this is the likelihood.
P(X) is the prior probability of X.
For the data sample, this is the probability that a fruit is red and round.
Bayes' Theorem determines the posterior probability P(H|X) from P(H),
P(X), and P(X|H):
Bayes' Rule:
P(H|X) = P(X|H) · P(H) / P(X)
i.e., Posterior = (Likelihood × Prior) / Evidence
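As a quick numeric illustration of the rule, here is a small sketch applied to the apple example; the three input probabilities are assumed values chosen only to exercise the formula.

```python
def posterior(likelihood, prior, evidence):
    # Bayes' rule: P(H|X) = P(X|H) * P(H) / P(X)
    return likelihood * prior / evidence

p_x_given_h = 0.8  # P(Red and Round | Apple) -- assumed likelihood
p_h         = 0.3  # P(Apple)                 -- assumed prior
p_x         = 0.4  # P(Red and Round)         -- assumed evidence

print(posterior(p_x_given_h, p_h, p_x))  # 0.6: confidence that X is an apple
```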
9. Instance
Test Phase
Given a new instance,
x’=(Outlook=Sunny, Temperature=Cool, Humidity=High,
Wind=Strong)
P(Outlook=Sunny|Play=Yes) = 2/9
P(Outlook=Sunny|Play=No) = 3/5
P(Temperature=Cool|Play=Yes) = 3/9
P(Temperature=Cool|Play=No) = 1/5
P(Humidity=High|Play=Yes) = 3/9
P(Humidity=High|Play=No) = 4/5
P(Wind=Strong|Play=Yes) = 3/9
P(Wind=Strong|Play=No) = 3/5
P(Play=Yes) = 9/14
P(Play=No) = 5/14
P(Yes|x’) ∝ [P(Sunny|Yes) P(Cool|Yes) P(High|Yes) P(Strong|Yes)] P(Play=Yes) = 0.0053
P(No|x’) ∝ [P(Sunny|No) P(Cool|No) P(High|No) P(Strong|No)] P(Play=No) = 0.0206
Since P(Yes|x’) < P(No|x’), we label x’ as “No”.
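The arithmetic above is easy to verify; the sketch below recomputes both unnormalized posteriors with exact fractions, using only the probabilities listed on this slide.

```python
from fractions import Fraction as F

# Unnormalized posteriors for x' = (Sunny, Cool, High, Strong),
# using the conditional probabilities and class priors above.
p_yes = F(9, 14) * F(2, 9) * F(3, 9) * F(3, 9) * F(3, 9)
p_no  = F(5, 14) * F(3, 5) * F(1, 5) * F(4, 5) * F(3, 5)

print(float(p_yes))                               # ~0.0053
print(float(p_no))                                # ~0.0206
print("Play =", "Yes" if p_yes > p_no else "No")  # Play = No
```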
10. Conclusion
Naive Bayes is one of the simplest density estimation methods from
which we can form one of the standard classification methods in
machine learning.
Very easy to program and intuitive.
Fast to train and to use as a classifier.
Very easy to deal with missing attributes (see the sketch after this list).
Very popular in fields such as computational linguistics/NLP.
Many successful applications, e.g., spam mail filtering.
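To illustrate the missing-attribute point from the list above, here is a minimal sketch (assumed code, not from the slides): when a feature value is unobserved, its factor is simply dropped from the product, and the remaining factors still give a usable class score.

```python
# Conditional probabilities reused from the test-phase slide; the dict-key
# encoding of feature values is a hypothetical convenience.
cond = {
    ("Outlook=Sunny", "Yes"): 2/9, ("Outlook=Sunny", "No"): 3/5,
    ("Humidity=High", "Yes"): 3/9, ("Humidity=High", "No"): 4/5,
}
prior = {"Yes": 9/14, "No": 5/14}

def score(observed, cls):
    # Multiply the prior by factors for observed features only;
    # missing features contribute no factor at all.
    s = prior[cls]
    for f in observed:
        s *= cond[(f, cls)]
    return s

# Temperature and Wind are missing: classify with the other two features.
x = ["Outlook=Sunny", "Humidity=High"]
print(max(prior, key=lambda c: score(x, c)))  # No
```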
Thank You !!