Computational model for artificial learning using formal concept analysis
1. Scientific Research Group in Egypt (SRGE),
http://www.egyptscience.net
Computational Model for Artificial
Learning Using Formal Concept Analysis
Mona Nagy ElBedwehy
Department of Mathematics, Faculty of Science, Damietta University
Email: monanagyelbedwehy@ymail.com
8th International Conference on Computer Engineering and Systems (ICCES’2013), EGYPT
4. Motivation (Cont.)
Many applications have a huge amount of data.
civil registration record
Unfortunately, our ability to understand and use this data does
not keep pace with its growth.
Methods are needed to generate a “summary” that represents a
conceptualization of the data set.
similarities among different citizens (City=Cairo, Gender= Female)
Machine learning provides tools by which large quantities of
data can be automatically analyzed to overcome these
limitations and difficulties.
Analysis of urban area population increase
Marketing analysis of store departments
Formal Concept Analysis is a technique that enables resolution of such problems.
5. Contribution
We formulate a computational model for the
binary classification process using formal concept
analysis.
The classification rules are derived and applied
successfully to different case studies.
6. Introduction
Machine learning is concerned (on the whole) with
concept learning and classification learning. The latter
is simply a generalization of the former.
Classification permits predictions to be derived on the
basis of common properties of a class of entities or
phenomena.
We focus on the second approach to artificial learning (AL),
namely classification learning.
Classification learning builds a classifier from a set of
training examples and uses it to assign unseen examples
to predefined classes.
7. Background
Classification Learning
Generalizes class descriptions by identifying the
common “core” characteristics of a set of training
objects, generating knowledge that enables novel
objects to be identified as belonging to one of the
classes.
Classification Learning Algorithms
Families shown: statistical classification, decision trees, neural networks, symbolic algorithms.
Algorithms named: Backpropagation, CART, Prism, SVM, Naïve Bayes, Bayesian networks, and the proposed model.
8. Background
Formal Concept Analysis (1)
Formal Concept Analysis (FCA) is a method used for
investigating and processing explicitly given information,
in order to allow for meaningful and comprehensive
interpretation.
Proposed by Wille.
An analysis of data.
Structures of formal abstractions of concepts of human
thought.
Formal emphasizes that the concepts are mathematical
objects, rather than concepts of mind.
9. Background
Formal Concept Analysis (2)
What is a Concept?
A formal concept is constituted by two parts, a set of objects A
and a set of attributes B, that stand in a certain relation:
Every object belonging to this concept has all the attributes
in B.
Every attribute belonging to this concept is shared by all
objects in A.
A is called the concept's extent.
B is called the concept's intent.
10. Background
Formal Concept Analysis (3)
Input: a matrix specifying a set of objects and attributes.
Output (via FCA): clusters of attributes and clusters of objects.
An object cluster is the set of all objects that share a common
subset of attributes.
An attribute cluster is the set of all attributes shared by one of
the natural object clusters.
Example objects (Object_1, Object_2, …): Duck, Goose, Parrot, …
Example attributes (Attribute_1, Attribute_2, Attribute_3, …): Has beak, Has feathers, Has two legs, …
Objects and attributes are linked by a relation.
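As a rough illustration of the input/output view on this slide, the following Python sketch enumerates every object cluster together with its attribute cluster for a tiny cross-table. The object names come from the slide; the attributes "can swim" and "can talk", the individual crosses, and all function names are assumptions made only for this example, and the brute-force enumeration is not claimed to be the method used by the author.

```python
from itertools import combinations

# Hypothetical cross-table: which object has which attribute.
# "can swim" and "can talk" are assumed attributes, not taken from the slides.
context = {
    "Duck":   {"has beak", "has feathers", "can swim"},
    "Goose":  {"has beak", "has feathers", "can swim"},
    "Parrot": {"has beak", "has feathers", "can talk"},
}


def objects_sharing(attrs, context):
    """Object cluster: all objects that have every attribute in attrs."""
    return frozenset(o for o, has in context.items() if attrs <= has)


def attributes_shared(objs, context):
    """Attribute cluster: all attributes common to every object in objs."""
    all_attrs = set().union(*context.values())
    return frozenset(a for a in all_attrs
                     if all(a in context[o] for o in objs))


def all_clusters(context):
    """Brute force: close every subset of attributes into a cluster pair."""
    all_attrs = sorted(set().union(*context.values()))
    found = set()
    for k in range(len(all_attrs) + 1):
        for subset in combinations(all_attrs, k):
            extent = objects_sharing(set(subset), context)
            intent = attributes_shared(extent, context)
            found.add((extent, intent))
    return found


clusters = all_clusters(context)
for extent, intent in sorted(clusters, key=lambda c: -len(c[0])):
    print(sorted(extent), "<->", sorted(intent))
```

The enumeration is exponential in the number of attributes, so it is only suitable for toy tables like this one.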
11. Background
Mathematical Definition of FCA
A formal concept is defined within a context.
Definition 1. A formal context is a triple (O, A, R), where O is a
set of objects, A is a set of attributes, and R is a binary relation
between O and A.
For $M \subseteq O$ and $B \subseteq A$, Equation (1) represents the
set of attributes common to the objects in M, while the set of objects
which have all attributes in B is represented as in Equation (2).
\[ M' = \{\, a \in A \mid o \, R \, a \ \text{for all } o \in M \,\} \tag{1} \]
\[ B' = \{\, o \in O \mid o \, R \, a \ \text{for all } a \in B \,\} \tag{2} \]
Definition 2. A formal concept of the context (O, A, R) is
a pair (M, B) with $M \subseteq O$, $B \subseteq A$, $B' = M$ and $M' = B$.
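The two derivation operators and the test in Definition 2 can be sketched directly in Python. The context below (object and attribute labels and the relation R) is a made-up toy example, and the function names are hypothetical; only the set definitions themselves follow Equations (1) and (2).

```python
# Toy formal context (O, A, R); the relation is an assumption for illustration.
O = {"Object_1", "Object_2", "Object_3"}
A = {"Attribute_1", "Attribute_2", "Attribute_3"}
R = {
    ("Object_1", "Attribute_1"), ("Object_1", "Attribute_2"),
    ("Object_2", "Attribute_1"), ("Object_2", "Attribute_2"),
    ("Object_3", "Attribute_2"), ("Object_3", "Attribute_3"),
}


def prime_of_objects(M):
    """Equation (1): attributes common to all objects in M."""
    return {a for a in A if all((o, a) in R for o in M)}


def prime_of_attributes(B):
    """Equation (2): objects that have all attributes in B."""
    return {o for o in O if all((o, a) in R for a in B)}


def is_formal_concept(M, B):
    """Definition 2: (M, B) is a concept iff B' = M and M' = B."""
    return (M <= O and B <= A
            and prime_of_attributes(B) == M
            and prime_of_objects(M) == B)


M = {"Object_1", "Object_2"}
B = prime_of_objects(M)
print(sorted(B), is_formal_concept(M, B))
```

In this toy context, the pair ({Object_1}, {Attribute_1, Attribute_2}) is not a concept, because Object_2 also has both attributes and therefore belongs to the extent.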
12. Background
Computational Models
Computational models assume that the human brain is an
information processing system and that thinking is a form of
computing.
Such a model processes information by taking input and following
a step-by-step algorithm to produce a specific output.
The aim of computational modeling is to:
increase our knowledge;
improve our understanding of how the human
brain works;
build computer systems that can execute a given
task optimally and in the most efficient possible
way.
13. The Proposed
Computational Model
The model induces the classification rules which characterize
each class.
In the proposed model:
1. Convert the given data into binary data. Binary
data are data whose units can take only two
possible values, 0 and 1. We extend the
collection of attributes with new attributes to
represent the binary data.
2. Use FCA to describe the classification process, so
the following two functions are presented:
\[ R(o) = \{\, a \in A \mid (o, a) \in R \,\} \]
\[ N(a) = \{\, o \in O \mid (o, a) \in R \,\} \]
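A minimal sketch of the two steps above, assuming that binarization means one new binary attribute per (attribute, value) pair; the slides do not spell out the encoding, so this is only one plausible reading. The citizen records and the helper names `binarize`, `R_of` and `N_of` are hypothetical.

```python
def binarize(rows):
    """Turn nominal records into a binary relation.

    rows maps each object to a dict {attribute: nominal value}.
    Assumption: every (attribute, value) pair becomes one binary
    attribute named "attribute=value"; a pair (o, a) in the result
    means that object o has the value 1 for binary attribute a.
    """
    return {(o, f"{attr}={val}")
            for o, values in rows.items()
            for attr, val in values.items()}


def R_of(o, relation):
    """R(o): the set of binary attributes that object o has."""
    return {a for (obj, a) in relation if obj == o}


def N_of(a, relation):
    """N(a): the set of objects that have binary attribute a."""
    return {o for (o, attr) in relation if attr == a}


# Made-up civil-registration style records (illustrative only).
citizens = {
    "c1": {"City": "Cairo", "Gender": "Female"},
    "c2": {"City": "Cairo", "Gender": "Male"},
    "c3": {"City": "Giza", "Gender": "Female"},
}
relation = binarize(citizens)
print(sorted(R_of("c1", relation)))
print(sorted(N_of("City=Cairo", relation)))
```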
14. The Proposed
Computational Model
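[Flowchart of the proposed model, reconstructed as steps; here R(a) denotes the set of objects that have attribute a.]
Input data: training data and a partition of the training set into $O_{C_1}$ and $O_{C_2}$.
1. Convert the given data into binary data.
2. Compute the k-conjunctions for A.
3. While the last attribute in the k-conjunctions is not reached:
   Find R(a).
   If $R(a) \cap O_{C_1} \neq \emptyset$ and $R(a) \cap O_{C_2} = \emptyset$, add a to $A_{C_1}$ and R(a) to $D_{C_1}$.
   If $R(a) \cap O_{C_1} = \emptyset$ and $R(a) \cap O_{C_2} \neq \emptyset$, add a to $A_{C_2}$ and R(a) to $D_{C_2}$.
4. While $D_{C_i} \neq \emptyset$:
   Add the R(a) with the maximum no. of objects in $D_{C_i}$ to $F_{C_i}$.
   While $O \neq \emptyset$: if $R(a) \cap O \neq \emptyset$, add the attribute to $F_{C_i}$ and remove R(a) from O and $D_{C_i}$.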
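The rule-induction idea in this flowchart can be approximated by the following Python sketch. It is a simplified reading, not the author's implementation: it considers single attributes only (no k-conjunctions), keeps attributes whose objects fall in exactly one class, and then greedily picks the attribute that covers the most still-uncovered objects of that class. The function names, the data, and the tie-breaking and prediction rules are all assumptions.

```python
def induce_rules(covers, classes):
    """Greedy selection of class-characteristic attributes.

    covers:  attribute -> set of training objects that have it.
    classes: class label -> set of training objects in that class.
    Returns: class label -> list of selected attributes.
    """
    rules = {}
    for label, members in classes.items():
        others = set().union(
            *(objs for other, objs in classes.items() if other != label))
        # Candidates: attributes seen in this class and in no other class.
        candidates = {a: objs for a, objs in covers.items()
                      if objs & members and not objs & others}
        remaining, chosen = set(members), []
        while remaining:
            best = max(candidates,
                       key=lambda a: len(candidates[a] & remaining),
                       default=None)
            if best is None or not candidates[best] & remaining:
                break  # nothing left that covers the remaining objects
            chosen.append(best)
            remaining -= candidates[best]
        rules[label] = chosen
    return rules


def predict(attributes, rules):
    """Return the single class whose rules fire, else None (assumed policy)."""
    hits = [label for label, attrs in rules.items()
            if attributes & set(attrs)]
    return hits[0] if len(hits) == 1 else None


# Made-up binary training data (illustrative only).
covers = {
    "shape=round":  {"o1", "o2"},
    "shape=square": {"o3", "o4"},
    "color=red":    {"o1", "o3"},
    "color=blue":   {"o2", "o4"},
}
classes = {"C1": {"o1", "o2"}, "C2": {"o3", "o4"}}
rules = induce_rules(covers, classes)
print(rules)
```

Returning None when no rule or more than one rule fires leaves an object unclassified; how such objects are handled is a design choice this sketch does not take from the slides.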
15. Experimental Results And
Discussions (1)
The proposed model is applied to the following
datasets from the well-known UCI repository of
machine learning datasets; none of them has missing
attribute values.
Table I. Datasets used in learning the concept classification
Dataset  Description                                              No. of classes  No. of attributes  No. of instances
monk1    Monk’s Problem1                                          2               6                  432
monk2    Monk’s Problem2                                          2               6                  432
monk3    Monk’s Problem3                                          2               6                  432
D1       Acute Inflammations (Inflammation of urinary bladder)    2               6                  120
D2       Acute Inflammations (Nephritis of renal pelvis origin)   2               6                  120
Note: Monk3 problem contains 5% noise data.
16. Experimental Results And
Discussions (2)
The following performance indices are calculated for the
proposed model, where TP, TN, FP and FN denote the numbers of
true positives, true negatives, false positives and false
negatives.
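\[ \text{Sensitivity} = \text{Recall} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}, \]
\[ \text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}, \qquad \text{F-Measure} = \frac{2\,(\text{Precision} \times \text{Recall})}{\text{Precision} + \text{Recall}}, \]
\[ PP = \text{Precision} = \frac{TP}{TP + FP}, \qquad NP = \frac{TN}{TN + FN}. \]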
The performance indices of the proposed model are
compared with Support Vector Machine (SVM) and
Classification and Regression Tree (CART).
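The indices above are simple ratios of the four confusion-matrix counts, so they can be checked with a few lines of Python. The function name and the dictionary keys are hypothetical, and the sketch assumes every denominator is non-zero. The example counts are the SVM row for monk1 reported in Table IV.

```python
def performance_indices(tp, tn, fp, fn):
    """Compute the indices defined above from confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    sensitivity = tp / (tp + fn)   # recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)     # PP
    negative_predictive = tn / (tn + fn)   # NP
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "Sen.": sensitivity,
        "Spe.": specificity,
        "PP": precision,
        "NP": negative_predictive,
        "Accuracy": accuracy,
        "FM": f_measure,
    }


# SVM on monk1 (counts from Table IV): TP=137, TN=149, FP=67, FN=79.
svm_monk1 = performance_indices(tp=137, tn=149, fp=67, fn=79)
for name, value in svm_monk1.items():
    print(f"{name}: {100 * value:.2f}%")
```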
17. Experimental Results And
Discussions (3)
Table II. Comparison of classification accuracy: SVM, CART and the
proposed model (P. model)
         Correct                      Incorrect                    Misclassified
Dataset  SVM      CART     P. model   SVM      CART     P. model   SVM     CART    P. model
monk1    66.20%   83.33%   92.59%     33.80%   16.67%   0.00%      0.00%   0.00%   7.41%
monk2    63.19%   61.11%   63.66%     36.81%   38.89%   18.06%     0.00%   0.00%   18.28%
monk3    78.70%   97.22%   86.11%     21.30%   2.78%    7.18%      0.00%   0.00%   6.71%
D1       100.0%   85.00%   100.0%     0.00%    15.00%   0.00%      0.00%   0.00%   0.00%
D2       100.0%   100.0%   100.0%     0.00%    0.00%    0.00%      0.00%   0.00%   0.00%
18. Experimental Results And
Discussions (4)
Table III. Comparison of classification accuracy: SVM, CART and the
proposed model (P. model) : misclassified assigned to majority class
         Correct Accuracy             Incorrect Accuracy
Dataset  SVM      CART     P. model   SVM      CART     P. model
monk1    66.20%   83.33%   100.0%     33.80%   16.67%   0.00%
monk2    63.19%   61.11%   76.39%     36.81%   38.89%   23.61%
monk3    78.70%   97.22%   87.27%     21.30%   2.78%    12.73%
D1       100.0%   85.00%   100.0%     0.00%    15.00%   0.00%
D2       100.0%   100.0%   100.0%     0.00%    0.00%    0.00%
19. Experimental Results And
Discussions (5)
Table IV. Comparison of performance indices for SVM, CART and the
proposed model for monk1
          TP   TN   FP  FN  Sen.    Spe.    PP      NP      FM
P. model  216  216  0   0   100%    100%    100%    100%    100%
SVM       137  149  67  79  63.43%  68.98%  67.16%  65.35%  65.24%
CART      168  192  24  48  77.78%  88.89%  87.50%  80.00%  82.35%
Table V. Comparison of performance indices for SVM, CART and the
proposed model for monk2
          TP   TN  FP   FN  Sen.    Spe.    PP      NP      FM
P. model  244  86  56   46  84.14%  60.56%  81.33%  65.15%  82.71%
SVM       259  14  128  31  89.31%  9.86%   66.93%  31.11%  76.52%
CART      199  65  77   91  68.62%  45.77%  72.10%  41.67%  70.32%
20. Experimental Results And
Discussions (6)
Table VI. Comparison of performance indices for SVM, CART and the
proposed model for monk3
          TP   TN   FP  FN  Sen.    Spe.    PP      NP      FM
P. model  199  179  49  5   97.55%  78.51%  80.24%  97.28%  88.05%
SVM       157  183  45  47  76.96%  80.26%  77.72%  79.57%  77.34%
CART      204  216  12  0   100%    94.74%  94.44%  100%    97.14%
Table VII. Comparison of performance indices for SVM, CART and the
proposed model for inflammation of urinary bladder
          TP  TN  FP  FN  Sen.    Spe.  PP    NP      FM
P. model  23  17  0   0   100%    100%  100%  100%    100%
SVM       23  17  0   0   100%    100%  100%  100%    100%
CART      17  17  0   6   73.91%  100%  100%  73.91%  85.00%
21. Experimental Results And
Discussions (7)
Table VIII. Comparison of performance indices for SVM, CART and the
proposed model for D2
          TP  TN  FP  FN  Sen.  Spe.  PP    NP    FM
P. model  17  23  0   0   100%  100%  100%  100%  100%
SVM       17  23  0   0   100%  100%  100%  100%  100%
CART      17  23  0   0   100%  100%  100%  100%  100%
A ROC curve is a graphical plot that illustrates the
performance of a binary classifier system.
It is created by plotting the fraction of true positives
out of the positives (sensitivity) against the fraction of false
positives out of the negatives (1 - specificity).
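For a classifier that outputs hard class labels, the confusion matrix yields a single point in ROC space. The sketch below computes that point and, as an assumption not stated in the slides, the area under the piecewise-linear curve through (0, 0), the point, and (1, 1); that area equals the mean of sensitivity and specificity. The function names are hypothetical, and the example counts are the proposed model's monk3 row from Table VI.

```python
def roc_point(tp, tn, fp, fn):
    """Return (false positive rate, true positive rate) for one classifier."""
    tpr = tp / (tp + fn)   # sensitivity
    fpr = fp / (fp + tn)   # 1 - specificity
    return fpr, tpr


def single_point_auc(fpr, tpr):
    """Area under the curve (0,0) -> (fpr,tpr) -> (1,1), by trapezoids."""
    left = fpr * tpr / 2                  # triangle under the first segment
    right = (1 - fpr) * (tpr + 1) / 2     # trapezoid under the second segment
    return left + right


# Proposed model on monk3 (counts from Table VI): TP=199, TN=179, FP=49, FN=5.
fpr, tpr = roc_point(tp=199, tn=179, fp=49, fn=5)
auc = single_point_auc(fpr, tpr)
print(f"FPR={fpr:.4f}, TPR={tpr:.4f}, AUC={auc:.4f}")
```

A full ROC curve with more than one interior point would need classifier scores and a varying threshold, which the slides do not provide.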
22. Experimental Results And
Discussions (8)
[Figure: ROC curve for Monk1]
[Figure: ROC curve for Monk2]
23. Experimental Results And
Discussions (9)
[Figure: ROC curve for Monk3]
[Figure: ROC curve for D1]
24. Experimental Results And
Discussions (10)
[Figure: ROC curve for D2]
25. Conclusion
Artificial learning is concerned with classification
learning, a form of supervised learning
embodied in the human mind.
We proposed a computational model for the classification
learning process, described in terms of formal
concept analysis (FCA).
The proposed model characterizes each class and predicts
the class label of a new object.
The performance of the proposed model has been
evaluated on real-world data: the classification rules
derived from the training data enable us
to predict the outcome of unseen data in a test set.
In the reported experiments, the proposed model matches or
outperforms SVM on all five datasets and CART on all but monk3.