Kovalerchuk, B., Vityaev E., Comparison of relational methods and attribute-based methods for data mining in
     intelligent systems, 1999 IEEE Int. Symposium on Intelligent Control/Intelligent Systems, Cambridge, Mass,
     1999. pp. 162-166.

      Comparison of relational methods and attribute-based methods
                  for data mining in intelligent systems
                                                    Boris Kovalerchuk
                              Dept. of Computer Science, Central Washington University,
                                         Ellensburg, WA, 98926-7520, USA
                                                  borisk@cwu.edu
                                                      Evgenii Vityaev
                                Institute of Mathematics, Russian Academy of Science,
                                              Novosibirsk, 630090, Russia
                                                  vityaev@math.nsc.ru



                         Abstract

Most of the data mining methods in real-world intelligent systems are attribute-based machine learning methods such as neural networks, nearest neighbors and decision trees. They are relatively simple, efficient, and can handle noisy data. However, these methods have two strong limitations: (1) a limited form of expressing background knowledge, and (2) the lack of relations other than "object-attribute", which makes the concept description language inappropriate for some applications.

Relational hybrid data mining methods based on first-order logic were developed to meet these challenges. In this paper they are compared with Neural Networks and other benchmark methods. The comparison shows several advantages of relational methods.

1. Problem definition and objectives

Performance of intelligent systems such as intelligent control systems can be significantly improved if control signals are generated using a prediction of the future behavior of the controlled system. In this study we assume that at each moment t the performance of the system is measured by a gain function G(t,yt,yt-1,u), where yt and yt-1 are the system's states at moment t and the previous moment t-1, respectively, and u is the control signal at moment t. Implementing this approach requires discovering regularities in the system's behavior and computing ỹt, the predicted value of the system's state, using the discovered regularities.

Data mining methods are designed for discovering hidden regularities in databases. Data mining has two major sources to infer regularities: database and machine learning technologies. The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience. In recent years many successful machine learning applications have been developed, including autonomous vehicles that learn to drive on public highways [1].

Currently statistical and Artificial Neural Network methods dominate the design of intelligent systems and data mining. Neural Networks have three shortcomings [1] for forecasting, related to:
1) Explainability,
2) Use of logical relations, and
3) Tolerance for sparse data.

Alternative relational (symbolic) machine learning methods have shown their effectiveness in robotics (navigation, 3-dimensional scene analysis) and drug design (selection of the most promising components for drug design). In practice, learning systems based on first-order representations have been successfully applied to many problems in engineering, chemistry, physics, medicine, finance and other fields [1,2]. Traditionally symbolic methods are used in areas with a lot of non-numeric (symbolic) knowledge; in robot navigation this is the relative location of obstacles (on the right, on the left, and so on). We discuss the key algorithms and theory that form the core of symbolic machine learning methods for applications with dominating numerical data. The mathematical formalisms of first-order logic rules described in [1,3,4] are used. Note that the class of general propositional and first-order logic rules covered by relational methods is wider than the class of decision trees [1, pp. 274-275].

Specifically, Relational Hybrid Data Mining (RHDM) combines inductive logic programming (ILP) with probabilistic inference. The combination benefits from noise-robust probabilistic inference and from the highly expressive and understandable first-order logic rules employed in ILP. Relational Data Mining technology is a data modeling algorithm that does not assume the functional form of the modeled relationship a priori. It can automatically consider a large number of inputs (e.g., time series characterization parameters) and learn how to combine these to produce estimates for future values of a specific output variable.

Relational data mining (RDM) has roots in logic programming, which provides the solid theoretical basis for RDM. On the other hand, at present existing Inductive Logic Programming systems representing RDM are relatively inefficient and have rather limited facilities for handling numerical data [5]. We developed a hybrid ILP and probabilistic technique (the MMDR method) that handles numerical data efficiently [2,6].

One of the main advantages of RDM over attribute-based learning is ILP's generality of representation for background knowledge. This enables the user to provide, in a more natural way, domain-specific background knowledge to be used in learning. The use of background knowledge enables the user both to develop a suitable problem representation and to introduce problem-specific constraints into the learning process. By contrast, attribute-based learners can typically accept background knowledge in rather limited form only [5].

2. Relational methods

A machine learning method called Machine Methods for Discovering Regularities (MMDR) is applied for forecasting time series. This forecast is then used to compute control signals to get a better value of the gain function G.

The MMDR method expresses patterns in first-order logic and assigns probabilities to rules generated by composing patterns. As with any technique based on first-order logic, MMDR delivers human-readable forecasting rules [1, ch. 10], i.e. rules understandable in ordinary language, in addition to the forecast. A field expert can evaluate the performance of the forecast as well as of a forecasting rule.

Also, as with any technique based on probabilistic estimates, this technique delivers rules tested for their statistical significance. Statistically significant rules have an advantage over rules tested only for their performance on training and test data [1, ch. 5]. Training and testing data can be too limited and/or not representative. If rules rely only on them, then there are more chances that these rules will not deliver a right forecast on other data.

What is the motivation to use the suggested MMDR method in particular? MMDR uses a hypothesis/rule generation and selection process based on fundamental representative measurement theory [3]. The original challenge for MMDR was the simulation of discovering scientific laws from empirical data in chemistry and physics. There is a well-known difference between "black box" models and fundamental models (laws) in modern physics. The latter have a much longer life, wider scope and a solid background. Mitchell in [1] noted the importance of the fact that relational assertions "can be conveniently expressed using first-order representations, while they are very difficult to describe using propositional representations".

Many well-known rule learners such as AQ and CN2 are propositional [1, pp. 279, 283]. Note that decision tree methods represent a particular type of propositional representation [1, p. 275]. Therefore decision tree methods such as ID3 and its successor C4.5 fit better to tasks without relational assertions. Mitchell argues and gives examples that propositional representations offer no general way to describe the essential relations among the values of the attributes [1, pp. 283-284]. Below we follow his simple illustrative example. In contrast with propositional rules, a program using first-order representations could learn the following general rule:

   IF Father(x,y) & Female(y) THEN Daughter(x,y),

where x and y are variables that can be bound to any person. For the target concept Daughter1,2 and a propositional rule learner such as CN2 or C4.5, the result would be a collection of very specific rules such as

   IF (Father1=Bob)&(Name2=Bob)&(Female1=True) THEN Daughter1,2=True.

Although it is correct, this rule is so specific that it will rarely, if ever, be useful in classifying future pairs of people [4, pp. 283-284]. It is obvious that a similar problem exists for autoregression methods like ARIMA and for Neural Network methods. First-order logic rules have an advantage in discovering relational assertions because they capture relations directly, e.g., Father(x,y) in the example above.

In addition, first-order rules allow one to express naturally other, more general hypotheses, not only the relation between pairs of attributes [3]. These more general rules can serve both for classification problems and for an interval forecast of a continuous variable. Moreover, these rules are able to capture Markov chain types of models used for time series forecasting. We share Mitchell's opinion about the importance of algorithms designed to learn sets of first-order rules that contain variables, because "first-order rules are much more expressive than propositional rules" [1, p. 274].

What is the difference between MMDR and other Machine Learning methods dealing with first-order logic [1,4,5]? From our viewpoint, the main accent in other first-order methods is on two computational complexity issues: how wide is the class of hypotheses tested by the particular machine learning algorithms, and how to construct a learning algorithm that finds deterministic rules. The emphasis of MMDR is on probabilistic first-order rules and on measurement issues, i.e., how we can move from a real measurement to a first-order logic representation. This is a non-trivial task [3]. For example, how can a temperature measurement be represented in terms of first-order logic without losing the essence of the attribute (temperature in this case) and without importing unnecessary conventional properties? For instance, the Fahrenheit and Celsius zeros of temperature are our conventions, in contrast with the Kelvin scale, where the zero is a real physical zero: there are no temperatures below it. Therefore incorporating properties of the Fahrenheit zero into first-order rules may force us to discover/learn properties of this convention along with more significant scale-invariant forecasting rules. Learning algorithms in a space with such accidental properties may be very time-consuming and may produce inappropriate rules.

It is well known that the general problem of rule generation and testing is NP-complete. Therefore the discussion above is closely related to the following questions. What determines the number of rules, and when should we stop generating them? What is the justification for specifying particular expressions instead of any others? Using the approach from [3], we select rules which are simplest and consistent with the measurement scales of a particular task. The algorithm stops generating new rules when they become too complex (i.e., statistically insignificant for the data) in spite of possibly high accuracy on training data. The other obvious stop criterion is a time limitation. A detailed discussion of the mechanism of initial rule selection from the measurement theory [3] viewpoint is out of the scope of this paper. A special study may result in a catalogue of initial rules/hypotheses to be tested (learned) for particular applications. In this way any field analyst can choose rules to be tested without generating them.

The next critical issue in applying data-driven forecasting systems is generalization. The "Discovery" system, developed as a software implementation of MMDR [6], generalizes data through "law-like" logical probabilistic rules. Discovered rules have similar statistical estimates and significance on the training and test sets of the studied time series. Theoretical advantages of MMDR generalization are presented in [6, 1].

3. Method for discovering regularities

Figure 1 describes the steps of MMDR. In the first step we select and/or generate a class of first-order logic rules suitable for a particular task. The next step is learning the particular first-order logic rules using available training data. Then we test the first-order logic rules on training and test data using the Fisher statistical criterion. After that we select statistically significant rules and apply the Occam's razor principle: prefer the simplest hypotheses (rules) that fit the data [1, p. 65]. Simultaneously we use the rules' performance on training and test data for their selection. We may iterate back and forth among these steps several times to find the best rules. The last step is creating interval and threshold forecasts using the selected first-order logic rules:

   IF A(x,y,…,z) THEN B(x,y,…,z).

Figure 1. Flow diagram for MMDR: steps and technique applied. The diagram shows four blocks: (1) MMDR models: selecting/generating logical rules with variables x,y,…,z of the form IF A(x,y,…,z) THEN B(x,y,…,z); (2) learning logical rules on training data using conditional probabilities of inference P(B(x,y,…,z)/A(x,y,…,z)); (3) testing and selecting logical rules (Occam's razor, Fisher criterion); (4) creating interval and threshold forecasts using the rules and p-quantiles.

As we mentioned above, conceptually law-like rules came from the philosophy of science. These rules attempt to capture mathematically the essential features of scientific laws:
(1) High level of generalization;
(2) Simplicity (Occam's razor); and
(3) Refutability.
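The search for law-like rules sketched in this section can be illustrated in a few lines of code. This is a minimal sketch, not the "Discovery" system: the attribute names are hypothetical, premises are plain boolean attributes rather than first-order predicates, and a simple support threshold stands in for the Fisher significance criterion.

```python
import itertools
import random

def cond_prob(data, premises, target):
    """Estimate Prob(target | premises) from boolean records."""
    covered = [r for r in data if all(r[p] for p in premises)]
    if not covered:
        return 0.0, 0
    hits = sum(1 for r in covered if r[target])
    return hits / len(covered), len(covered)

def lawlike_rules(data, attributes, target, max_len=2, min_support=20):
    """Keep a rule IF A1&...&Ak THEN target only if every truncated
    sub-rule (including the empty premise, i.e. the base rate) has a
    strictly lower conditional probability -- the 'law-like' property."""
    rules = []
    for k in range(1, max_len + 1):
        for prem in itertools.combinations(attributes, k):
            p, n = cond_prob(data, prem, target)
            if n < min_support:  # crude stand-in for a significance test
                continue
            if all(cond_prob(data, prem[:j] + prem[j + 1:], target)[0] < p
                   for j in range(k)):
                rules.append((prem, round(p, 3)))
    return rules

# Toy data: Y is (mostly) A AND B; attribute names are hypothetical.
random.seed(0)
data = []
for _ in range(500):
    a, b, c = (random.random() < 0.5 for _ in range(3))
    y = (a and b) if random.random() < 0.9 else (random.random() < 0.5)
    data.append({"A": a, "B": b, "C": c, "Y": y})

rules = dict(lawlike_rules(data, ["A", "B", "C"], "Y"))
```

On this toy data the conjunction A & B emerges as law-like because its conditional probability exceeds that of each of its sub-rules (A alone, B alone, and the base rate), which mirrors the chain property Prob(C1) < Prob(C2) < … exploited by the algorithm.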
The first feature -- generalization -- means that any other regularity covering the same events would be less general, i.e., applicable only to a part of the events covered by the law-like regularity. The second feature -- simplicity -- reflects the fact that a law-like rule is shorter than other rules. The third feature is refutability: a law-like rule R1 is more refutable than another rule R2 if there are more testing examples which could refute R1 than R2, yet the testing examples fail to refute R1.

Formally, we present an IF-THEN rule C as

   A1 & … & Ak ⇒ A0,

where the IF-part, A1&…&Ak, consists of true/false logical statements A1,…,Ak, and the THEN-part consists of a single logical statement A0. Each statement Ai is a given refutable statement or its negation, which is also refutable. Rule C allows us to generate sub-rules with a truncated IF-part, e.g.

   A1&A2 ⇒ A0,   A1&A2&A3 ⇒ A0

and so on. For rule C its conditional probability

   Prob(C) = Prob(A0/A1&…&Ak)

is defined. Similarly, conditional probabilities

   Prob(A0/Ai1&…&Aih)

are defined for sub-rules Ci of the form

   Ai1 & … & Aih ⇒ A0.

We use the conditional probability Prob(C) = Prob(A0/A1&…&Ak) for estimating the forecasting power of the rule to predict A0.

The goal is to find "law-like" rules. A rule is "law-like" if and only if all of its sub-rules have a smaller conditional probability than the rule itself, and the statistical significance of this difference is established. Each sub-rule Ci generalizes rule C, i.e., potentially Ci is true for a larger set of instances. Another definition of "law-like" rules can be stated in terms of generalization: a rule is "law-like" if and only if it cannot be generalized without producing a statistically significant reduction in its conditional probability.

"Law-like" rules defined in this way hold all three above-mentioned properties of scientific laws: generality, simplicity, and refutability.

The "Discovery" software searches all chains

   C1, C2, …, Cm-1, Cm

of nested "law-like" sub-rules, where C1 is a sub-rule of rule C2, C1 = sub(C2), C2 is a sub-rule of rule C3, C2 = sub(C3), and finally Cm-1 is a sub-rule of rule Cm, Cm-1 = sub(Cm). In addition, they satisfy an important property:

   Prob(C1) < Prob(C2) < … < Prob(Cm-1) < Prob(Cm).

This property is the basis for the following theorem [6]:

   All rules which have a maximum value of conditional probability
   can be found at the end of such chains.

This theorem basically means that the MMDR algorithm does not miss the best rules. The algorithm stops generating new rules when they become too complex (i.e., statistically insignificant for the data) even if the rules are highly accurate on training data. The Fisher statistical criterion is used in this algorithm for testing statistical significance. The other obvious stop criterion is a time limitation.

4. Performance

We consider a control task with three control actions, u = 1, u = 0 and u = -1, set as follows:

   u =   1,  if ỹt > yt-1
         0,  if ỹt = yt-1
        -1,  if ỹt < yt-1

where yt-1 is the actual value of the time series at moment t-1 and ỹt is the forecasted value of y at moment t. Forecasted values are generated by all four studied methods, including the Neural Network method and the relational MMDR method.

The gain function G is computed for every t after all actual values of yt become known. The gain function G(t,u,yt,yt-1) depends on u and on the actual values of the output, yt and yt-1:

   G(t, u, yt, yt-1) =   yt - yt-1,    if u = 1
                         0,            if u = 0
                        -(yt - yt-1),  if u = -1

The total gain is defined as

   Gtotal = Σt=1,n G(t, u, yt, yt-1).

The adaptive linear method in essence works according to the following formula: ỹt = a·yt-1 + b·yt-2 + c.

The "passive control" method ignores the forecast ỹt and in all cases generates the same control signal, "do nothing":

   IF (ỹt > yt-1 or ỹt ≤ yt-1) THEN u = 0 (do nothing).

Table 1 shows the performance of MMDR in comparison with three other methods: Adaptive Linear, Passive Control, and Neural Networks.

Table 1. Simulated gain

   Method            Gain (%)
                     Data set 1        Data set 2        Average (two data
                     (477 instances)   (424 instances)   sets, 901 instances)
   Adaptive Linear   21.9              18.28             20.09
   MMDR              26.69             43.83             35.26
   Passive control   30.39             20.56             25.47
   Neural Network    18.94             16.07             17.5

5. Conclusion

On average, the human-readable and understandable regularities generated by the relational method MMDR outperformed the other methods, including widely used neural networks (Table 1).

6. References

1. Mitchell T., Machine Learning, McGraw-Hill, NY, 1997.
2. Kovalerchuk B., Vityaev E., "Discovering Law-like Regularities in Financial Time Series", Journal of Computational Intelligence in Finance, Vol. 6, No. 3, 1998, pp. 12-26.
3. Krantz D.H., Luce R.D., Suppes P., Tversky A., Foundations of Measurement, vol. 1-3, Academic Press, NY, London, 1971, 1989, 1990.
4. Russell S., Norvig P., Artificial Intelligence: A Modern Approach, Prentice Hall, 1995.
5. Bratko I., Muggleton S., "Applications of inductive logic programming", Communications of the ACM, vol. 38, no. 11, 1995, pp. 65-70.
6. Vityaev E., Moskvitin A., "Introduction to Discovery theory: Discovery system", Computational Systems, v. 148, Novosibirsk, 1993, pp. 117-163 (in Russian).
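As an appendix-style illustration, the control rule and gain function of Section 4 can be sketched directly. This is a toy example with a made-up series, not the paper's experiment; note that u·(yt - yt-1) reproduces the three-case definition of G.

```python
def control(y_pred, y_prev):
    """Control action u: 1 for a predicted rise, -1 for a predicted
    fall, 0 when the forecast equals the previous value."""
    return 1 if y_pred > y_prev else (-1 if y_pred < y_prev else 0)

def total_gain(actual, forecast):
    """Gtotal = sum of G(t, u, yt, yt-1) over the series.
    u is taken from the forecast; the gain comes from the actual
    move yt - yt-1, so u * (yt - yt-1) reproduces the three-case G."""
    g = 0.0
    for t in range(1, len(actual)):
        u = control(forecast[t], actual[t - 1])
        g += u * (actual[t] - actual[t - 1])
    return g

# Toy series (hypothetical, not the paper's data): a perfect forecast
# collects the absolute value of every move.
actual = [1.0, 2.0, 3.0, 2.0, 3.0]
perfect = total_gain(actual, actual)  # 4.0: every move predicted correctly
```

Passive control (u = 0 everywhere) scores zero on this series by definition, which clarifies why its nonzero Table 1 gains reflect the particular data sets rather than the strategy.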

Machine Learning for Adversarial Agent MicroworldsMachine Learning for Adversarial Agent Microworlds
Machine Learning for Adversarial Agent Microworldsbutest
 
pptx - Distributed Parallel Inference on Large Factor Graphs
pptx - Distributed Parallel Inference on Large Factor Graphspptx - Distributed Parallel Inference on Large Factor Graphs
pptx - Distributed Parallel Inference on Large Factor Graphsbutest
 
MapReduce: Distributed Computing for Machine Learning
MapReduce: Distributed Computing for Machine LearningMapReduce: Distributed Computing for Machine Learning
MapReduce: Distributed Computing for Machine Learningbutest
 
Course Syllabus
Course SyllabusCourse Syllabus
Course Syllabusbutest
 
An Eye On Google, Executive Summary Presentation
An Eye On Google, Executive Summary PresentationAn Eye On Google, Executive Summary Presentation
An Eye On Google, Executive Summary PresentationKetzirah Lesser
 
Shrm poll diversity_final
Shrm poll diversity_finalShrm poll diversity_final
Shrm poll diversity_finalshrm
 

Destacado (9)

GRM98012IGR.doc
GRM98012IGR.docGRM98012IGR.doc
GRM98012IGR.doc
 
Dealing with Diversity: Understanding WCF Communication Options in ...
Dealing with Diversity: Understanding WCF Communication Options in ...Dealing with Diversity: Understanding WCF Communication Options in ...
Dealing with Diversity: Understanding WCF Communication Options in ...
 
Garbage Collection, Program Comprehension and Machine Learning
Garbage Collection, Program Comprehension and Machine LearningGarbage Collection, Program Comprehension and Machine Learning
Garbage Collection, Program Comprehension and Machine Learning
 
Machine Learning for Adversarial Agent Microworlds
Machine Learning for Adversarial Agent MicroworldsMachine Learning for Adversarial Agent Microworlds
Machine Learning for Adversarial Agent Microworlds
 
pptx - Distributed Parallel Inference on Large Factor Graphs
pptx - Distributed Parallel Inference on Large Factor Graphspptx - Distributed Parallel Inference on Large Factor Graphs
pptx - Distributed Parallel Inference on Large Factor Graphs
 
MapReduce: Distributed Computing for Machine Learning
MapReduce: Distributed Computing for Machine LearningMapReduce: Distributed Computing for Machine Learning
MapReduce: Distributed Computing for Machine Learning
 
Course Syllabus
Course SyllabusCourse Syllabus
Course Syllabus
 
An Eye On Google, Executive Summary Presentation
An Eye On Google, Executive Summary PresentationAn Eye On Google, Executive Summary Presentation
An Eye On Google, Executive Summary Presentation
 
Shrm poll diversity_final
Shrm poll diversity_finalShrm poll diversity_final
Shrm poll diversity_final
 

Similar a Comparison of relational and attribute-IEEE-1999-published ...

A scenario based approach for dealing with
A scenario based approach for dealing withA scenario based approach for dealing with
A scenario based approach for dealing withijcsa
 
Main single agent machine learning algorithms
Main single agent machine learning algorithmsMain single agent machine learning algorithms
Main single agent machine learning algorithmsbutest
 
Competitive advantage from Data Mining: some lessons learnt ...
Competitive advantage from Data Mining: some lessons learnt ...Competitive advantage from Data Mining: some lessons learnt ...
Competitive advantage from Data Mining: some lessons learnt ...butest
 
Competitive advantage from Data Mining: some lessons learnt ...
Competitive advantage from Data Mining: some lessons learnt ...Competitive advantage from Data Mining: some lessons learnt ...
Competitive advantage from Data Mining: some lessons learnt ...butest
 
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...Editor IJCATR
 
Classifier Model using Artificial Neural Network
Classifier Model using Artificial Neural NetworkClassifier Model using Artificial Neural Network
Classifier Model using Artificial Neural NetworkAI Publications
 
Knowledge Graph and Similarity Based Retrieval Method for Query Answering System
Knowledge Graph and Similarity Based Retrieval Method for Query Answering SystemKnowledge Graph and Similarity Based Retrieval Method for Query Answering System
Knowledge Graph and Similarity Based Retrieval Method for Query Answering SystemIRJET Journal
 
PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFER...
PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFER...PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFER...
PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFER...cscpconf
 
Advanced Intelligent Systems - 2020 - Sha - Artificial Intelligence to Power ...
Advanced Intelligent Systems - 2020 - Sha - Artificial Intelligence to Power ...Advanced Intelligent Systems - 2020 - Sha - Artificial Intelligence to Power ...
Advanced Intelligent Systems - 2020 - Sha - Artificial Intelligence to Power ...remAYDOAN3
 
A Review on Reasoning System, Types, and Tools and Need for Hybrid Reasoning
A Review on Reasoning System, Types, and Tools and Need for Hybrid ReasoningA Review on Reasoning System, Types, and Tools and Need for Hybrid Reasoning
A Review on Reasoning System, Types, and Tools and Need for Hybrid ReasoningBRNSSPublicationHubI
 
An Iterative Improved k-means Clustering
An Iterative Improved k-means ClusteringAn Iterative Improved k-means Clustering
An Iterative Improved k-means ClusteringIDES Editor
 
Ontology Based PMSE with Manifold Preference
Ontology Based PMSE with Manifold PreferenceOntology Based PMSE with Manifold Preference
Ontology Based PMSE with Manifold PreferenceIJCERT
 
Top cited articles 2020 - Advanced Computational Intelligence: An Internation...
Top cited articles 2020 - Advanced Computational Intelligence: An Internation...Top cited articles 2020 - Advanced Computational Intelligence: An Internation...
Top cited articles 2020 - Advanced Computational Intelligence: An Internation...aciijournal
 
REPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELS
REPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELSREPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELS
REPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELScscpconf
 
Pattern recognition using context dependent memory model (cdmm) in multimodal...
Pattern recognition using context dependent memory model (cdmm) in multimodal...Pattern recognition using context dependent memory model (cdmm) in multimodal...
Pattern recognition using context dependent memory model (cdmm) in multimodal...ijfcstjournal
 
June 2020: Top Read Articles in Advanced Computational Intelligence
June 2020: Top Read Articles in Advanced Computational IntelligenceJune 2020: Top Read Articles in Advanced Computational Intelligence
June 2020: Top Read Articles in Advanced Computational Intelligenceaciijournal
 
chalenges and apportunity of deep learning for big data analysis f
 chalenges and apportunity of deep learning for big data analysis f chalenges and apportunity of deep learning for big data analysis f
chalenges and apportunity of deep learning for big data analysis fmaru kindeneh
 
T OWARDS A S YSTEM D YNAMICS M ODELING M E- THOD B ASED ON DEMATEL
T OWARDS A  S YSTEM  D YNAMICS  M ODELING  M E- THOD B ASED ON  DEMATELT OWARDS A  S YSTEM  D YNAMICS  M ODELING  M E- THOD B ASED ON  DEMATEL
T OWARDS A S YSTEM D YNAMICS M ODELING M E- THOD B ASED ON DEMATELijcsit
 
factorization methods
factorization methodsfactorization methods
factorization methodsShaina Raza
 

Similar a Comparison of relational and attribute-IEEE-1999-published ... (20)

A scenario based approach for dealing with
A scenario based approach for dealing withA scenario based approach for dealing with
A scenario based approach for dealing with
 
Main single agent machine learning algorithms
Main single agent machine learning algorithmsMain single agent machine learning algorithms
Main single agent machine learning algorithms
 
Competitive advantage from Data Mining: some lessons learnt ...
Competitive advantage from Data Mining: some lessons learnt ...Competitive advantage from Data Mining: some lessons learnt ...
Competitive advantage from Data Mining: some lessons learnt ...
 
Competitive advantage from Data Mining: some lessons learnt ...
Competitive advantage from Data Mining: some lessons learnt ...Competitive advantage from Data Mining: some lessons learnt ...
Competitive advantage from Data Mining: some lessons learnt ...
 
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
 
31 34
31 3431 34
31 34
 
Classifier Model using Artificial Neural Network
Classifier Model using Artificial Neural NetworkClassifier Model using Artificial Neural Network
Classifier Model using Artificial Neural Network
 
Knowledge Graph and Similarity Based Retrieval Method for Query Answering System
Knowledge Graph and Similarity Based Retrieval Method for Query Answering SystemKnowledge Graph and Similarity Based Retrieval Method for Query Answering System
Knowledge Graph and Similarity Based Retrieval Method for Query Answering System
 
PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFER...
PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFER...PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFER...
PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFER...
 
Advanced Intelligent Systems - 2020 - Sha - Artificial Intelligence to Power ...
Advanced Intelligent Systems - 2020 - Sha - Artificial Intelligence to Power ...Advanced Intelligent Systems - 2020 - Sha - Artificial Intelligence to Power ...
Advanced Intelligent Systems - 2020 - Sha - Artificial Intelligence to Power ...
 
A Review on Reasoning System, Types, and Tools and Need for Hybrid Reasoning
A Review on Reasoning System, Types, and Tools and Need for Hybrid ReasoningA Review on Reasoning System, Types, and Tools and Need for Hybrid Reasoning
A Review on Reasoning System, Types, and Tools and Need for Hybrid Reasoning
 
An Iterative Improved k-means Clustering
An Iterative Improved k-means ClusteringAn Iterative Improved k-means Clustering
An Iterative Improved k-means Clustering
 
Ontology Based PMSE with Manifold Preference
Ontology Based PMSE with Manifold PreferenceOntology Based PMSE with Manifold Preference
Ontology Based PMSE with Manifold Preference
 
Top cited articles 2020 - Advanced Computational Intelligence: An Internation...
Top cited articles 2020 - Advanced Computational Intelligence: An Internation...Top cited articles 2020 - Advanced Computational Intelligence: An Internation...
Top cited articles 2020 - Advanced Computational Intelligence: An Internation...
 
REPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELS
REPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELSREPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELS
REPRESENTATION OF UNCERTAIN DATA USING POSSIBILISTIC NETWORK MODELS
 
Pattern recognition using context dependent memory model (cdmm) in multimodal...
Pattern recognition using context dependent memory model (cdmm) in multimodal...Pattern recognition using context dependent memory model (cdmm) in multimodal...
Pattern recognition using context dependent memory model (cdmm) in multimodal...
 
June 2020: Top Read Articles in Advanced Computational Intelligence
June 2020: Top Read Articles in Advanced Computational IntelligenceJune 2020: Top Read Articles in Advanced Computational Intelligence
June 2020: Top Read Articles in Advanced Computational Intelligence
 
chalenges and apportunity of deep learning for big data analysis f
 chalenges and apportunity of deep learning for big data analysis f chalenges and apportunity of deep learning for big data analysis f
chalenges and apportunity of deep learning for big data analysis f
 
T OWARDS A S YSTEM D YNAMICS M ODELING M E- THOD B ASED ON DEMATEL
T OWARDS A  S YSTEM  D YNAMICS  M ODELING  M E- THOD B ASED ON  DEMATELT OWARDS A  S YSTEM  D YNAMICS  M ODELING  M E- THOD B ASED ON  DEMATEL
T OWARDS A S YSTEM D YNAMICS M ODELING M E- THOD B ASED ON DEMATEL
 
factorization methods
factorization methodsfactorization methods
factorization methods
 

Más de butest

EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEbutest
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jacksonbutest
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer IIbutest
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazzbutest
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.docbutest
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1butest
 
Facebook
Facebook Facebook
Facebook butest
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...butest
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...butest
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTbutest
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docbutest
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docbutest
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.docbutest
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!butest
 

Más de butest (20)

EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBE
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jackson
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer II
 
PPT
PPTPPT
PPT
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.doc
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1
 
Facebook
Facebook Facebook
Facebook
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENT
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.doc
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.doc
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.doc
 
hier
hierhier
hier
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!
 

Comparison of relational and attribute-IEEE-1999-published ...

2) Use of logical relations, and
3) Tolerance for sparse data.

Relational hybrid data mining methods based on first-order logic were developed to meet these challenges. In this paper they are compared with Neural Networks and other benchmark methods; the comparison shows several advantages of relational methods.

Alternative relational (symbolic) machine learning methods have shown their effectiveness in robotics
(navigation, three-dimensional scene analysis) and drug design (selection of the most promising components for drug design). In practice, learning systems based on first-order representations have been successfully applied to many problems in engineering, chemistry, physics, medicine, finance and other fields [1,2]. Traditionally, symbolic methods are used in areas with a lot of non-numeric (symbolic) knowledge; in robot navigation this is the relative location of obstacles (on the right, on the left, and so on).

1. Problem definition and objectives

Performance of intelligent systems such as intelligent control systems can be significantly improved if control signals are generated using a prediction of the future behavior of the controlled system. In this study we assume that at each moment t the performance of the system is measured by a gain function G(t, yt, yt-1, u), where yt and yt-1 are the system's states at moment t and the previous moment t-1, respectively, and u is the control signal at moment t. Implementing this approach requires discovering regularities in the system's behavior and computing ỹt, the predicted values of the system's state, using the discovered regularities.

Data mining methods are designed for discovering hidden regularities in databases. Data mining has two major sources to infer regularities: database and machine learning technologies. The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience [1].

We discuss the key algorithms and theory that form the core of symbolic machine learning methods for applications with dominating numerical data. The mathematical formalisms of first-order logic rules described in [1,3,4] are used. Note that the class of general propositional and first-order logic rules covered by relational methods is wider than the class of decision trees [1, pp. 274-275].

Specifically, Relational Hybrid Data Mining (RHDM) combines inductive logic programming (ILP) with probabilistic inference. The combination benefits from noise-robust probabilistic inference and the highly expressive and understandable first-order logic rules employed in ILP.
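The gain-driven control scheme of Section 1 can be sketched in a few lines. This is a minimal illustration only: the paper does not specify the form of G or of the discovered regularities, so the quadratic gain, the linear `predict_next_state` rule, and the finite candidate-control set below are all assumptions made for the example.

```python
# Illustrative sketch of forecast-driven control (assumed dynamics, not from the paper).
# G plays the role of the paper's gain function G(t, y_t, y_{t-1}, u).

def G(t, y_t, y_prev, u):
    # Hypothetical gain: reward staying near a setpoint of 0 with small control effort.
    return -(y_t ** 2) - 0.1 * (u ** 2)

def predict_next_state(y_t, y_prev, u):
    # Stand-in for a discovered regularity (here: a simple linear rule).
    return 1.5 * y_t - 0.6 * y_prev + u

def choose_control(t, y_t, y_prev, candidate_controls):
    # Pick the control signal whose predicted next state maximizes the gain.
    return max(candidate_controls,
               key=lambda u: G(t + 1, predict_next_state(y_t, y_prev, u), y_t, u))

u_best = choose_control(0, 1.0, 0.8, [-1.0, -0.5, 0.0, 0.5, 1.0])
print(u_best)
```

The point of the sketch is only the division of labor: data mining supplies `predict_next_state`, and control selection reduces to maximizing G over the predicted states.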
Relational Data Mining technology is a data modeling algorithm that does not assume the functional form of the modeled relationship a priori. It can automatically consider a large number of inputs (e.g., time series characterization parameters) and learn how to combine them to produce estimates for future values of a specific output variable.

Relational data mining (RDM) has roots in logic programming, which provides a solid theoretical basis for RDM. On the other hand, at present the existing Inductive Logic Programming systems representing RDM are relatively inefficient and have rather limited facilities for handling numerical data [5]. We developed a hybrid ILP and probabilistic technique (the MMDR method) that handles numerical data efficiently [2,6].

One of the main advantages of RDM over attribute-based learning is ILP's generality of representation for background knowledge. This enables the user to provide, in a more natural way, domain-specific background knowledge to be used in learning. The use of background knowledge enables the user both to develop a suitable problem representation and to introduce problem-specific constraints into the learning process. By contrast, attribute-based learners can typically accept background knowledge in rather limited form only [5].

2. Relational methods

A machine learning method called Machine Methods for Discovering Regularities (MMDR) is applied for forecasting time series. This forecast is then used to compute control signals that yield a better value of the gain function G.

The MMDR method expresses patterns in first-order logic and assigns probabilities to rules generated by composing patterns. As with any technique based on first-order logic, MMDR allows one to obtain human-readable forecasting rules [1, ch. 10], i.e., rules understandable in ordinary language in addition to the forecast. A field expert can evaluate the performance of the forecast as well as a forecasting rule.

Also, as with any technique based on probabilistic estimates, this technique delivers rules tested for their statistical significance. Statistically significant rules have an advantage over rules tested only for their performance on training and test data [1, ch. 5]: training and testing data can be too limited and/or not representative, and if rules rely only on them, there are more chances that these rules will not deliver a right forecast on other data.

What is the motivation to use the suggested MMDR method in particular? MMDR uses a hypothesis/rule generation and selection process based on fundamental representative measurement theory [3]. The original challenge for MMDR was the simulation of discovering scientific laws from empirical data in chemistry and physics. There is a well-known difference between "black box" models and fundamental models (laws) in modern physics; the latter have a much longer life, a wider scope and a solid background. Mitchell [1] noted the importance of the fact that relational assertions "can be conveniently expressed using first-order representations, while they are very difficult to describe using propositional representations".

Many well-known rule learners such as AQ and CN2 are propositional [1, pp. 279, 283]. Note that decision tree methods represent a particular type of propositional representation [1, p. 275]. Therefore decision tree methods such as ID3 and its successor C4.5 fit better to tasks without relational assertions. Mitchell argues, with examples, that propositional representations offer no general way to describe the essential relations among the values of the attributes [1, pp. 283-284]. Below we follow his simple illustrative example. In contrast with propositional rules, a program using first-order representations could learn the following general rule:

IF Father(x,y) & Female(y) THEN Daughter(x,y),

where x and y are variables that can be bound to any person. For the target concept Daughter1,2, a propositional rule learner such as CN2 or C4.5 would produce a collection of very specific rules such as

IF (Father1=Bob) & (Name2=Bob) & (Female1=True) THEN Daughter1,2=True.

Although it is correct, this rule is so specific that it will rarely, if ever, be useful in classifying future pairs of people [4, pp. 283-284]. An obvious similar problem exists for autoregression methods like ARIMA and for Neural Network methods. First-order logic rules have an advantage in discovering relational assertions because they capture relations directly, e.g., Father(x,y) in the example above.

In addition, first-order rules allow one to naturally express other, more general hypotheses, not only relations between pairs of attributes [3]. These more general rules can be used both for classification problems and for interval forecasts of a continuous variable.
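The expressiveness gap in Mitchell's Daughter example can be made concrete with a small sketch. The relation tuples below are made up for illustration; the first rule mirrors the paper's first-order rule with variables x and y, while the second mirrors the constant-bound propositional rule.

```python
# Illustrative sketch of the Daughter example (the relation tuples are invented).

father = {("Bob", "Sharon"), ("Tom", "Ann")}   # Father(x, y): x is the father of y
female = {"Sharon", "Ann"}                      # Female(y)

def daughter(x, y):
    # IF Father(x, y) & Female(y) THEN Daughter(x, y): one rule, any binding of x, y.
    return (x, y) in father and y in female

def daughter_propositional(person1, person2):
    # Propositional analogue: a very specific rule tied to the constant "Bob";
    # it cannot generalize to the (Tom, Ann) pair.
    return person1 == "Bob" and person2 == "Sharon"

print(daughter("Bob", "Sharon"), daughter("Tom", "Ann"))
print(daughter_propositional("Bob", "Sharon"), daughter_propositional("Tom", "Ann"))
```

The first-order rule covers both pairs; the propositional rule covers only the pair whose constants it mentions, which is exactly the over-specificity the paper criticizes.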
3) forecast of a continuous variable.

Moreover, these rules are able to capture Markov-chain types of models used for time series forecasting. We share Mitchell's opinion about the importance of algorithms designed to learn sets of first-order rules that contain variables, because "first-order rules are much more expressive than propositional rules" [1, p. 274].

What is the difference between MMDR and other machine learning methods dealing with first-order logic [1,4,5]? From our viewpoint, the main accent in other first-order methods is on two computational complexity issues: how wide is the class of hypotheses tested by the particular machine learning algorithm, and how to construct a learning algorithm to find deterministic rules. The emphasis of MMDR is on probabilistic first-order rules and on measurement issues, i.e., how we can move from a real measurement to a first-order logic representation. This is a non-trivial task [3]. For example, how do we represent a temperature measurement in terms of first-order logic without losing the essence of the attribute (temperature in this case) and without introducing unnecessary conventional properties? For instance, the Fahrenheit and Celsius zeros of temperature are our conventions, in contrast with the Kelvin scale, where the zero is a real physical zero: there are no temperatures below it. Therefore, incorporating properties of the Fahrenheit zero into first-order rules may force us to discover/learn properties of this convention along with the more significant, scale-invariant forecasting rules. Learning algorithms in a space with those kinds of accidental properties may be very time consuming and may produce inappropriate rules.

It is well known that the general problem of rule generation and testing is NP-complete. Therefore the discussion above is closely related to the following questions. What determines the number of rules, and when do we stop generating rules? What is the justification for specifying particular expressions instead of any other expressions? Using the approach from [3], we select rules which are simplest and consistent with the measurement scales for a particular task. The algorithm stops generating new rules when they become too complex (i.e., statistically insignificant for the data) in spite of possibly high accuracy on training data. The other obvious stop criterion is a time limit. A detailed discussion of the mechanism of initial rule selection from the measurement theory [3] viewpoint is outside the scope of this paper. A special study may result in a catalogue of initial rules/hypotheses to be tested (learned) for particular applications. In this way any field analyst could choose rules to be tested without generating them.

The next critical issue in applying data-driven forecasting systems is generalization. The "Discovery" system, developed as a software implementation of MMDR [6], generalizes data through "law-like" logical probabilistic rules. Discovered rules have similar statistical estimates and significance on the training and test sets of the studied time series. Theoretical advantages of MMDR generalization are presented in [6, 1].

3. Method for discovering regularities

Figure 1 describes the steps of MMDR. On the first step we select and/or generate a class of first-order logic rules suitable for a particular task. The next step is learning the particular first-order logic rules using available training data. Then we test the first-order logic rules on training and test data using the Fisher statistical criterion. After that we select statistically significant rules and apply the Occam's razor principle: prefer the simplest hypothesis (rules) that fits the data [1, p. 65]. Simultaneously we use the rules' performance on training and test data for their selection. We may iterate back and forth among these three steps several times to find the best rules. The last step is creating interval and threshold forecasts using the selected first-order logic rules:

IF A(x,y,…,z) THEN B(x,y,…,z).

Figure 1. Flow diagram for MMDR: steps and technique applied:
(1) MMDR: selecting/generating logical rules with variables x,y,…,z of the form IF A(x,y,…,z) THEN B(x,y,…,z);
(2) learning logical rules on training data using conditional probabilities of inference P(B(x,y,…,z)/A(x,y,…,z));
(3) testing and selecting logical rules (Occam's razor, Fisher criterion);
(4) creating interval and threshold forecasts using rules IF A(x,y,…,z) THEN B(x,y,…,z) and p-quintiles.

As we mentioned above, conceptually law-like rules came from the philosophy of science. These rules attempt to mathematically capture the essential features of scientific laws:
(1) high level of generalization;
(2) simplicity (Occam's razor); and
(3) refutability.
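The learning step, estimating the conditional probability of inference P(B(x,y,…,z)/A(x,y,…,z)) for a candidate rule from training data, can be sketched as follows. This is a minimal illustration, not the "Discovery" system's implementation: the toy series, the two relational predicates, and the plain frequency estimate are all assumptions made here for the example (the paper additionally tests each rule's statistical significance with the Fisher criterion).

```python
# Minimal sketch: estimate P(B/A) for a first-order rule
# IF A(x,y) THEN B(x,y) over pairs of consecutive time-series values.
# The predicates and data below are illustrative only.

def rule_confidence(pairs, A, B):
    """Relative frequency of B among the pairs where A holds."""
    satisfied = [p for p in pairs if A(*p)]
    if not satisfied:
        return None  # the rule body never fires; P(B/A) is undefined
    return sum(1 for p in satisfied if B(*p)) / len(satisfied)

# Toy time series, turned into pairs (y[t-1], y[t]).
series = [3, 4, 6, 5, 7, 8, 6, 9, 10, 8]
pairs = list(zip(series, series[1:]))

# Hypothetical relational predicates on a pair of moments.
A = lambda prev, cur: prev < cur        # the value went up...
B = lambda prev, cur: cur - prev <= 2   # ...and by at most 2

print(rule_confidence(pairs, A, B))  # prints 5/6 ~ 0.833
```

A sub-rule with a truncated IF-part would be scored the same way, and the rule is kept as "law-like" only if every sub-rule's estimate is lower.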
The first feature, generalization, means that any other regularity covering the same events would be less general, i.e., applicable only to a part of the events covered by the law-like regularity. The second feature, simplicity, reflects the fact that a law-like rule is shorter than other rules. A law-like rule R1 is more refutable than another rule R2 if there are more testing examples which could refute R1 than R2, but the testing examples fail to refute R1.

Formally, we present an IF-THEN rule C as

A1 & … & Ak ⇒ A0,

where the IF-part, A1 & … & Ak, consists of true/false logical statements A1, …, Ak, and the THEN-part consists of a single logical statement A0. Each statement Ai is a given refutable statement or its negation, which is also refutable. Rule C allows us to generate sub-rules with a truncated IF-part, e.g.

A1 & A2 ⇒ A0,   A1 & A2 & A3 ⇒ A0

and so on. For rule C its conditional probability

Prob(C) = Prob(A0/A1 & … & Ak)

is defined. Similarly, conditional probabilities

Prob(A0/Ai1 & … & Aih)

are defined for sub-rules Ci of the form

Ai1 & … & Aih ⇒ A0.

We use the conditional probability Prob(C) = Prob(A0/A1 & … & Ak) for estimating the forecasting power of the rule to predict A0.

The goal is to find "law-like" rules. A rule is "law-like" if and only if all of its sub-rules have a smaller conditional probability than the rule, and the statistical significance of this is established. Each sub-rule Ci generalizes rule C, i.e., potentially Ci is true for a larger set of instances. Another definition of "law-like" rules can be stated in terms of generalization: a rule is "law-like" if and only if it cannot be generalized without producing a statistically significant reduction in its conditional probability.

"Law-like" rules defined in this way hold all three of the above-mentioned properties of scientific laws: generality, simplicity, and refutability.

The "Discovery" software searches all chains

C1, C2, …, Cm-1, Cm

of nested "law-like" sub-rules, where C1 is a sub-rule of rule C2, C1 = sub(C2); C2 is a sub-rule of rule C3, C2 = sub(C3); and finally Cm-1 is a sub-rule of rule Cm, Cm-1 = sub(Cm). In addition, they satisfy an important property:

Prob(C1) < Prob(C2), …, Prob(Cm-1) < Prob(Cm).

This property is the base for the following theorem [6]:

All rules which have a maximum value of conditional probability can be found at the end of such chains.

This theorem basically means that the MMDR algorithm does not miss the best rules. The algorithm stops generating new rules when they become too complex (i.e., statistically insignificant for the data) even if the rules are highly accurate on training data. The Fisher statistical criterion is used in this algorithm for testing statistical significance. The other obvious stop criterion is a time limit.

4. Performance

We consider a control task with three control actions, u = 1, u = 0 and u = -1, set as follows:

u = 1,  if ỹt > yt-1
u = 0,  if ỹt = yt-1
u = -1, if ỹt < yt-1

where yt-1 is the actual value of the time series at moment t-1 and ỹt is the forecasted value of y for moment t. Forecasted values are generated by all four studied methods, including the Neural Network method and the relational MMDR method.
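The control rule and the gain it earns can be sketched compactly: the case-wise gain reduces to u·(yt − yt-1). The toy series below and the naive "repeat the last change" forecaster, standing in for the four studied methods, are assumptions for illustration only.

```python
# Sketch of the control task: at each moment t a forecaster proposes
# y_hat, the controller emits u in {1, 0, -1}, and the realized change
# y[t] - y[t-1] is scored. Series and forecaster are toy assumptions.

def control_action(y_hat, y_prev):
    """u = 1 if the forecast is up, -1 if down, 0 if unchanged."""
    return (y_hat > y_prev) - (y_hat < y_prev)

def gain(u, y_t, y_prev):
    """Compact form of the case-wise gain: u * (y[t] - y[t-1])."""
    return u * (y_t - y_prev)

series = [100, 102, 101, 104, 103, 106]
total = 0.0
for t in range(2, len(series)):
    # naive forecast: assume the previous change repeats
    y_hat = series[t - 1] + (series[t - 1] - series[t - 2])
    u = control_action(y_hat, series[t - 1])
    total += gain(u, series[t], series[t - 1])
print(total)  # prints -8.0
```

On this deliberately mean-reverting toy series the trend-following forecaster loses on every step, which is exactly the behavior the total-gain score is designed to expose.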
The gain function G is computed for every t after all actual values yt become known. The gain function G(t, u, yt, yt-1) depends on u and on the actual values of the output yt and yt-1:

G(t, u, yt, yt-1) = yt − yt-1,    if u = 1
                    0,            if u = 0
                    −(yt − yt-1), if u = −1

The total gain is defined as

Gtotal = Σt=1,n G(t, u, yt, yt-1).

The adaptive linear method in essence works according to the following formula: ỹt = a·yt-1 + b·yt-2 + c.

The "passive control" method ignores the forecast ỹt and in all cases generates the same control signal, "do nothing":

IF (ỹt > yt-1 or ỹt ≤ yt-1) THEN u = 0 (do nothing).

Table 1 shows the performance of MMDR in comparison with the three other methods: Adaptive Linear, Passive Control, and Neural Network.

Table 1. Simulated gain

Method             Gain (%)
                   Data set 1        Data set 2        Average (two data sets,
                   (477 instances)   (424 instances)   901 instances)
Adaptive Linear    21.9              18.28             20.09
MMDR               26.69             43.83             35.26
Passive control    30.39             20.56             25.47
Neural Network     18.94             16.07             17.5

5. Conclusion

On average, the human-readable and understandable regularities generated by the relational method MMDR outperformed the other methods, including widely used neural networks (Table 1).

6. References

1. Mitchell T., "Machine Learning", McGraw-Hill, NY, 1997.
2. Kovalerchuk B., Vityaev E., "Discovering Law-like Regularities in Financial Time Series", Journal of Computational Intelligence in Finance, Vol. 6, No. 3, 1998, pp. 12-26.
3. Krantz D.H., Luce R.D., Suppes P., Tversky A., "Foundations of Measurement", Vols. 1-3, Academic Press, NY, London, 1971, 1989, 1990.
4. Russell S., Norvig P., "Artificial Intelligence: A Modern Approach", Prentice Hall, 1995.
5. Bratko I., Muggleton S., "Applications of Inductive Logic Programming", Communications of the ACM, Vol. 38, No. 11, 1995, pp. 65-70.
6. Vityaev E., Moskvitin A., "Introduction to Discovery Theory: Discovery System", Computational Systems, Vol. 148, Novosibirsk, 1993, pp. 117-163 (in Russian).