Computational model for artificial learning using formal concept analysis

Scientific Research Group in Egypt (SRGE),
http://www.egyptscience.net

Computational Model for Artificial
Learning Using Formal Concept Analysis
Mona Nagy ElBedwehy
Department of Mathematics, Faculty of Science, Damietta University
Email: monanagyelbedwehy@ymail.com

8th International Conference on Computer Engineering and Systems (ICCES’2013), EGYPT

Agenda

Motivation
Contribution
Introduction
Background
  • Classification Learning
  • Formal Concept Analysis (FCA)
  • Computational Models
Proposed Computational Model
Experimental Results and Discussions
Conclusion

Motivation
Artificial intelligence embraces artificial learning, which involves:
• Developing programs that learn from past data.
• Understanding the mechanisms embodied in humans and translating them into computer programs.


Motivation (Cont.)
• Many applications hold a huge amount of data.
  Example: civil registration records.
• Unfortunately, the ability to understand and use this data has not kept pace with its growth.
• Methods are needed to generate a "summary" that represents a conceptualization of the data set.
  Example: similarities among different citizens (City = Cairo, Gender = Female).
• Machine learning provides tools by which large quantities of data can be analyzed automatically to overcome these limitations and difficulties.
  Examples: analysis of urban area population increase; marketing analysis of store departments.
Formal Concept Analysis is a technique that helps resolve such problems.


Contribution
• We formulate a computational model for the binary classification process using formal concept analysis.
• The classification rules are derived and applied successfully to different case studies.


Introduction
• Machine learning is concerned (on the whole) with concept learning and classification learning. The latter is simply a generalization of the former.
• Classification permits predictions to be derived on the basis of common properties of a class of entities or phenomena.
• We focus on the second approach of artificial learning (AL), namely classification learning.
Classification learning is the task of classifying unseen examples into predefined classes based on a set of training examples.


Background
Classification Learning
• Generalizes class descriptions by identifying the common "core" characteristics of a set of training objects, generating knowledge that enables novel objects to be identified as belonging to one of the classes.

Classification Learning Algorithms (diagram labels):
• Statistical Classification
• Decision Trees
• Neural Networks
• Backpropagation
• CART
• Prism
• SVM
• Naïve Bayes
• Bayesian Networks
• Symbolic algorithms
• The proposed model


Background
Formal Concept Analysis (1)
• Formal Concept Analysis (FCA) is a method for investigating and processing explicitly given information so as to allow meaningful and comprehensive interpretation.
• Proposed by Wille.
• A method of data analysis.
• Structures of formal abstractions of concepts of human thought.
• "Formal" emphasizes that the concepts are mathematical objects rather than concepts of the mind.


Background
What is a Concept?
• A formal concept consists of two parts that stand in a certain relation:
  - A: a set of objects
  - B: a set of attributes
• Every object belonging to this concept has all the attributes in B.
• Every attribute belonging to this concept is shared by all objects in A.
• A is called the concept's extent.
• B is called the concept's intent.

Background
Input: a matrix specifying a set of objects and attributes.
Output (via FCA): clusters of attributes and clusters of objects.
• An object cluster is the set of all objects that share a common subset of attributes.
• An attribute cluster is the set of all attributes shared by one of the natural object clusters.

Example (objects related to attributes):
Objects (Object_1, Object_2): Duck, Goose, Parrot, …
Attributes (Attribute_1, Attribute_2, Attribute_3): Has beak, Has feather, Has two legs, …
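
The two kinds of cluster can be illustrated with a small cross-table. This is a minimal sketch only: the 0/1 entries, the extra "Can swim" column, and the helper names `object_cluster` and `attribute_cluster` are assumptions made for illustration and do not come from the slides.

```python
# Cross-table for the bird example. The 0/1 entries and the "Can swim"
# column are illustrative assumptions, not data from the slides.
objects = ["Duck", "Goose", "Parrot"]
attributes = ["Has beak", "Has feather", "Has two legs", "Can swim"]
matrix = [
    [1, 1, 1, 1],  # Duck
    [1, 1, 1, 1],  # Goose
    [1, 1, 1, 0],  # Parrot
]


def object_cluster(attr_subset):
    """Return all objects that have every attribute in attr_subset."""
    cols = [attributes.index(a) for a in attr_subset]
    return [o for o, row in zip(objects, matrix) if all(row[c] for c in cols)]


def attribute_cluster(obj_subset):
    """Return all attributes shared by every object in obj_subset."""
    rows = [matrix[objects.index(o)] for o in obj_subset]
    return [a for j, a in enumerate(attributes) if all(r[j] for r in rows)]


swimmers = object_cluster(["Can swim"])
print(swimmers)                               # ['Duck', 'Goose']
print(attribute_cluster(["Duck", "Parrot"]))  # ['Has beak', 'Has feather', 'Has two legs']
```

Under these assumed entries, the objects sharing "Can swim" form one object cluster, and the attributes common to Duck and Parrot form an attribute cluster.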


Background
Mathematical Definition of FCA
• A formal concept is defined within a context.
Definition 1. A formal context is a triple (O, A, R), where O is a set of objects, A is a set of attributes, and R is a binary relation between O and A.
• Equation (1) gives the set of attributes common to the objects in M, while Equation (2) gives the set of objects that have all attributes in B.

$$M' = \{ a \in A \mid o \, R \, a \text{ for all } o \in M \} \qquad (1)$$

$$B' = \{ o \in O \mid o \, R \, a \text{ for all } a \in B \} \qquad (2)$$

Definition 2. A formal concept of the context (O, A, R) is a pair (M, B) with $M \subseteq O$, $B \subseteq A$, $B' = M$ and $M' = B$.
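
The two derivation operators and Definition 2 can be sketched in a few lines. The toy context below (objects `o1`..`o3`, attributes `a1`..`a3`, and the pairs in `R`) and all function names are assumptions chosen for illustration. The brute-force enumeration is only practical for very small contexts and is not the method used in the slides.

```python
from itertools import combinations

# Toy context (O, A, R); the objects, attributes and pairs are
# illustrative assumptions, not data from the slides.
O = {"o1", "o2", "o3"}
A = {"a1", "a2", "a3"}
R = {("o1", "a1"), ("o1", "a2"),
     ("o2", "a1"), ("o2", "a2"),
     ("o3", "a2"), ("o3", "a3")}


def prime_objects(M):
    """Equation (1): attributes common to all objects in M."""
    return {a for a in A if all((o, a) in R for o in M)}


def prime_attributes(B):
    """Equation (2): objects that have all attributes in B."""
    return {o for o in O if all((o, a) in R for a in B)}


def is_formal_concept(M, B):
    """Definition 2: (M, B) is a concept iff B' = M and M' = B."""
    return prime_attributes(B) == set(M) and prime_objects(M) == set(B)


def all_concepts():
    """Brute force: close every subset of O (toy sizes only)."""
    found = set()
    for k in range(len(O) + 1):
        for M in combinations(sorted(O), k):
            intent = prime_objects(M)
            extent = prime_attributes(intent)
            found.add((frozenset(extent), frozenset(intent)))
    return found


print(is_formal_concept({"o1", "o2"}, {"a1", "a2"}))  # True
print(is_formal_concept({"o1"}, {"a1", "a2"}))        # False
print(len(all_concepts()))                            # 4
```

In this toy context, ({o1}, {a1, a2}) is not a concept because o2 also has both attributes, so the extent must be {o1, o2}.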


Background
Computational Models
• Assume that the human brain is an information processing system and that thinking is a form of computing.
• Process information by taking input and following a step-by-step algorithm to produce a specific output.
• The aim of computational modeling is to:
  - increase our knowledge;
  - improve our understanding of how the human brain works;
  - build computer systems that can execute a given task optimally and as efficiently as possible.


The Proposed Computational Model
• Induces the classification rules that characterize each class.
• In the proposed model:
1. Convert the given data into binary data. Binary data are data whose units can take only two possible values, termed 0 and 1. We extend the collection of attributes with new attributes to represent the binary data.
2. Use FCA to describe the classification process, so the following two functions are introduced:

$$R(o) = \{ a \in A \mid (o, a) \in R \}$$

$$N(a) = \{ o \in O \mid (o, a) \in R \}$$
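
The two steps above can be sketched as follows. This is a hedged illustration: the slides do not say how the binary attributes are named, so the `attribute=value` encoding, the sample `records`, and the helper names `binarize`, `R_of` and `N_of` are all assumptions.

```python
# Hypothetical multi-valued records; names and values are assumptions
# for illustration, not data from the slides.
records = {
    "o1": {"shape": "round", "color": "red"},
    "o2": {"shape": "square", "color": "red"},
    "o3": {"shape": "round", "color": "blue"},
}


def binarize(rows):
    """Step 1: turn each (attribute, value) pair into a new binary
    attribute 'attribute=value' and return the relation R as a set of
    (object, binary_attribute) pairs."""
    return {(o, f"{k}={v}") for o, row in rows.items() for k, v in row.items()}


R = binarize(records)
O = set(records)
A = {a for _, a in R}


def R_of(o):
    """R(o): the binary attributes that object o has."""
    return {a for (obj, a) in R if obj == o}


def N_of(a):
    """N(a): the objects that have binary attribute a."""
    return {o for (o, attr) in R if attr == a}


print(sorted(R_of("o1")))        # ['color=red', 'shape=round']
print(sorted(N_of("color=red"))) # ['o1', 'o2']
```

Each original attribute with v distinct values becomes v binary attributes, which is one way to read "extend the collection of attributes with new attributes".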


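The Proposed Computational Model (flowchart)
1. Input data: the training data and a partition of the training set into $O_{C_1}$ and $O_{C_2}$.
2. Convert the given data into binary data.
3. Compute the k-conjunctions for A.
4. While the last attribute in the k-conjunction is not reached:
   • Find R(a).
   • If $R(a) \cap O_{C_1} \neq \emptyset$ and $R(a) \cap O_{C_2} = \emptyset$, add a to $A_{C_1}$ and R(a) to $D_{C_1}$.
   • If $R(a) \cap O_{C_1} = \emptyset$ and $R(a) \cap O_{C_2} \neq \emptyset$, add a to $A_{C_2}$ and R(a) to $D_{C_2}$.
5. While $O \neq \emptyset$ and while $D_{C_i} \neq \emptyset$:
   • Add the R(a) with the maximum number of objects in $D_{C_i}$ to $F_{C_i}$.
   • If $R(a) \cap O \neq \emptyset$, add the attribute to $F_{C_i}$ and remove R(a) from O and $D_{C_i}$.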
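
One possible reading of the flowchart above is a greedy covering procedure, sketched below. This is an interpretation under stated assumptions, not the author's implementation: R(a) is read as the set of training objects that satisfy the attribute conjunction a, conjunctions of up to k attributes are tried, and objects matching no rule are left unlabeled. The names `induce_rules`, `classify` and the toy data are hypothetical.

```python
from itertools import combinations


def induce_rules(R, classes, k=2):
    """R: set of (object, binary_attribute) pairs.
    classes: dict mapping a class label to its set of training objects.
    Returns a dict mapping each label to a list of attribute conjunctions."""
    attrs = sorted({a for _, a in R})
    all_objs = set().union(*classes.values())

    def cover(conj):
        # Objects satisfying every attribute of the conjunction.
        return {o for o in all_objs if all((o, a) in R for a in conj)}

    # Keep conjunctions whose cover meets exactly one class (assumed
    # reading of the two "Add a to A_Ci" branches).
    candidates = {label: {} for label in classes}
    for size in range(1, k + 1):
        for conj in combinations(attrs, size):
            objs = cover(conj)
            hit = [lab for lab, members in classes.items() if objs & members]
            if len(hit) == 1:
                candidates[hit[0]][conj] = objs

    # Greedy step: repeatedly take the conjunction covering the most
    # still-uncovered objects of the class.
    rules = {}
    for label, cands in candidates.items():
        remaining = set(classes[label])
        chosen = []
        while remaining:
            best = max(sorted(cands),
                       key=lambda c: len(cands[c] & remaining),
                       default=None)
            if best is None or not cands[best] & remaining:
                break
            chosen.append(best)
            remaining -= cands[best]
        rules[label] = chosen
    return rules


def classify(obj_attrs, rules):
    """Return the first label with a fully matching conjunction, else None."""
    for label, conjs in rules.items():
        if any(set(c) <= obj_attrs for c in conjs):
            return label
    return None


# Toy binary context (an assumption for illustration).
R = {("o1", "shape=round"), ("o1", "color=red"),
     ("o2", "shape=round"), ("o2", "color=blue"),
     ("o3", "shape=square"), ("o3", "color=red"),
     ("o4", "shape=square"), ("o4", "color=blue")}
classes = {"C1": {"o1", "o2"}, "C2": {"o3", "o4"}}
rules = induce_rules(R, classes)
print(rules)  # {'C1': [('shape=round',)], 'C2': [('shape=square',)]}
```

Returning `None` for an object that matches no rule is a design choice of this sketch; a caller could then assign such objects to the majority class.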

Experimental Results and Discussions (1)

• The proposed model is applied to the following datasets, which have no missing attribute values, from the well-known UCI repository of machine learning datasets.

Table I. Datasets used in learning the concept classification

Dataset  Description                                                 No. of classes  No. of attributes  No. of instances
monk1    Monk's Problem 1                                            2               6                  432
monk2    Monk's Problem 2                                            2               6                  432
monk3    Monk's Problem 3                                            2               6                  432
D1       Acute Inflammations (Inflammation of urinary bladder)       2               6                  120
D2       Acute Inflammations (Nephritis of renal pelvis origin)      2               6                  120

Note: The Monk3 problem contains 5% noisy data.


Discussions (2)
• Several performance indices are calculated for the proposed model, where TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives:

$$\text{Sensitivity} = \text{Recall} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP},$$

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}, \qquad \text{F-Measure} = \frac{2\,(\text{Precision} \times \text{Recall})}{\text{Precision} + \text{Recall}},$$

$$PP = \text{Precision} = \frac{TP}{TP + FP}, \qquad NP = \frac{TN}{TN + FN}$$

• The performance indices of the proposed model are compared with those of Support Vector Machine (SVM) and Classification and Regression Tree (CART).
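
The indices can be computed directly from the four confusion counts. The sketch below uses a hypothetical helper `performance_indices`; returning 0.0 when a denominator is zero is an assumption of this sketch, not something the slides specify. The sample counts are the SVM counts reported for monk1 in Table IV.

```python
def _ratio(num, den):
    # Assumption: report 0.0 when the denominator is zero.
    return num / den if den else 0.0


def performance_indices(tp, tn, fp, fn):
    """Compute the indices defined on this slide from confusion counts."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "sensitivity": recall,
        "specificity": _ratio(tn, tn + fp),
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "pp": precision,
        "np": _ratio(tn, tn + fn),
        "f_measure": _ratio(2 * precision * recall, precision + recall),
    }


# SVM on monk1 (counts taken from Table IV).
svm_monk1 = performance_indices(tp=137, tn=149, fp=67, fn=79)
for name, value in svm_monk1.items():
    print(f"{name}: {value * 100:.2f}%")
```

Rounded to two decimals, these values agree with the SVM row of Table IV (63.43%, 68.98%, 67.16%, 65.35%, 65.24%) and with the 66.20% accuracy in Table II.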


Discussions (3)

Table II. Comparison of classification accuracy: SVM, CART and the proposed model (P. model)

         Correct                       Incorrect                     Misclassified
Data     SVM      CART     P. model    SVM      CART     P. model    SVM     CART    P. model
monk1    66.20%   83.33%   92.59%      33.80%   16.67%   0.00%       0.00%   0.00%   7.41%
monk2    63.19%   61.11%   63.66%      36.81%   38.89%   18.06%      0.00%   0.00%   18.28%
monk3    78.70%   97.22%   86.11%      21.30%   2.78%    7.18%       0.00%   0.00%   6.71%
D1       100.0%   85.00%   100.0%      0.00%    15.00%   0.00%       0.00%   0.00%   0.00%
D2       100.0%   100.0%   100.0%      0.00%    0.00%    0.00%       0.00%   0.00%   0.00%


Discussions (4)

Table III. Comparison of classification accuracy: SVM, CART and the proposed model (P. model), with misclassified objects assigned to the majority class

         Correct Accuracy              Incorrect Accuracy
Dataset  SVM      CART     P. model    SVM      CART     P. model
monk1    66.20%   83.33%   100.0%      33.80%   16.67%   0.00%
monk2    63.19%   61.11%   76.39%      36.81%   38.89%   23.61%
monk3    78.70%   97.22%   87.27%      21.30%   2.78%    12.73%
D1       100.0%   85.00%   100.0%      0.00%    15.00%   0.00%
D2       100.0%   100.0%   100.0%      0.00%    0.00%    0.00%


Discussions (5)

Table IV. Comparison of performance indices for SVM, CART and the proposed model for monk1

Model     TP    TN    FP   FN   Sen.     Spe.     PP       NP       FM
P. model  216   216   0    0    100%     100%     100%     100%     100%
SVM       137   149   67   79   63.43%   68.98%   67.16%   65.35%   65.24%
CART      168   192   24   48   77.78%   88.89%   87.50%   80.00%   82.35%

Table V. Comparison of performance indices for SVM, CART and the proposed model for monk2

Model     TP    TN    FP    FN   Sen.     Spe.     PP       NP       FM
P. model  244   86    56    46   84.14%   60.56%   81.33%   65.15%   82.71%
SVM       259   14    128   31   89.31%   9.86%    66.93%   31.11%   76.52%
CART      199   65    77    91   68.62%   45.77%   72.10%   41.67%   70.32%

Discussions (6)
Table VI. Comparison of performance indices for SVM, CART and the proposed model for monk3

Model     TP    TN    FP   FN   Sen.     Spe.     PP       NP       FM
P. model  199   179   49   5    97.55%   78.51%   80.24%   97.28%   88.05%
SVM       157   183   45   47   76.96%   80.26%   77.72%   79.57%   77.34%
CART      204   216   12   0    100%     94.74%   94.44%   100%     97.14%

Table VII. Comparison of performance indices for SVM, CART and the proposed model for inflammation of urinary bladder (D1)

Model     TP   TN   FP   FN   Sen.     Spe.   PP     NP       FM
P. model  23   17   0    0    100%     100%   100%   100%     100%
SVM       23   17   0    0    100%     100%   100%   100%     100%
CART      17   17   0    6    73.91%   100%   100%   73.91%   85.00%

Discussions (7)
Table VIII. Comparison of performance indices for SVM, CART and the proposed model for D2

Model     TP   TN   FP   FN   Sen.   Spe.   PP     NP     FM
P. model  17   23   0    0    100%   100%   100%   100%   100%
SVM       17   23   0    0    100%   100%   100%   100%   100%
CART      17   23   0    0    100%   100%   100%   100%   100%

• A ROC curve is a graphical plot that illustrates the performance of a binary classifier system.
• It is created by plotting the fraction of true positives out of the positives (sensitivity) against the fraction of false positives out of the negatives (1 - specificity).
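
As a minimal sketch of how such a curve is traced, the function below sweeps a decision threshold over classifier scores and records one (1 - specificity, sensitivity) point per threshold. The function name `roc_points`, the labels and the scores are hypothetical; the slides do not state how scores were obtained for the compared models. A classifier that outputs only a class label yields a single such point.

```python
def roc_points(labels, scores):
    """labels: 1 for positive, 0 for negative.
    scores: higher means more likely positive.
    Returns (false positive rate, true positive rate) points, starting
    at (0, 0). Assumes at least one positive and one negative label."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        # Predict positive for every example scoring at least t.
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points


# Hypothetical labels and scores, for illustration only.
points = roc_points([1, 1, 0, 1, 0], [0.9, 0.8, 0.7, 0.4, 0.2])
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold can only add predicted positives, so both coordinates are non-decreasing and the curve ends at (1, 1).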



Discussions (8)
[Figure: ROC curve for Monk1]
[Figure: ROC curve for Monk2]

Discussions (9)
[Figure: ROC curve for Monk3]
[Figure: ROC curve for D1]

Discussions (10)
[Figure: ROC curve for D2]


Conclusion
• Artificial learning is concerned with classification learning, a supervised learning process embodied in the human mind.
• We proposed a computational model for the classification learning process, described in terms of formal concept analysis (FCA).
• The proposed model characterizes each class and predicts the class label of a new object.
• The performance of the proposed model was evaluated on real-world data; the classification rules obtained from the training data enable us to predict the outcome of unseen data in a test set.
• The proposed model compares favorably with CART and SVM: in Table III it matches or exceeds both in accuracy on monk1, monk2, D1 and D2, while CART is more accurate on monk3.


Thank you

http://www.egyptscience.net

8th International Symposium Advances in Artificial Intelligence and Applications (AAIA'13)
