SlideShare una empresa de Scribd logo
1 de 57
Descargar para leer sin conexión
Knowledge Discovery for the Semantic Web
under the Data Mining Perspective
Claudia d'Amato
Department of Computer Science
University of Bari
Italy
Knowledge Disovery: Definition
Knowledge Discovery (KD)
“the process of automatically searching large volumes of data
for patterns that can be considered knowledge about the
data” [Fay'96]
Patterns need to be:

New – Hidden in the data

Useful

Understandable
What is a Pattern and Knowldge?
Pattern
expression E in a given language L describing a subset FE
of
facts F.
E is called pattern if it is simpler than enumerating facts in FE
Knowledge
awareness or understanding of facts, information, descriptions,
or skills, which is acquired through experience or education
by perceiving, discovering, or learning
Knowledge Discovery
and Data Minig

KD is often related with Data Mining (DM) field

DM is one step of the "Knowledge Discovery in Databases"
process (KDD)[Fay'96]

DM is the computational process of discovering patterns in
large data sets involving methods at the intersection of
artificial intelligence, machine learning, statistics, and
databases.

DM goal: extracting information from a data set and
transforming it into an understandable
structure/representation for further use
What is not DM
Not all information discovery tasks are considered to be DM.

Looking up individual records using a DBMS

Finding particular Web pages via a query to a search engine
Tasks releted to the Information Retrieval (IR)
Nonetheless, DM techniques can been used to improve IR
systems

e.g. to create index structures for efficiently organizing and
retrieving information
The KDD process
Input
Data
Data Preprocessing
and Transformation
Data Mining
Interpretation
and
Evaluation
Information/
Taking Action
Data fusion (multiple sources)
Data Cleaning (noise,missing val.)
Feature Selection
Dimentionality Reduction
Data Normalization
The most labourous and
time consuming step
Filtering Patterns
Visualization
Statistical Analysis
- Hypothesis testing
- Attribute evaluation
- Comparing learned models
- Computing Confidence Intervals
CRISP-DM (Cross Industry Standard Process for Data
Mining) alternative process model developed by a
consortium of several companies
All data mining methods use induction-based learing
The knowledge
gained at the end of
the process is given
as a model/data
generalization
The KDD process
Input
Data
Data Preprocessing
and Transformation
Data
Mining
Interpretation
and
Evaluation
Information/
Taking Action
Data fusion (multiple sources)
Data Cleaning (noise,missing val.)
Feature Selection
Dimentionality Reduction
Data Normalization
The most labourous and
time consuming step
Filtering Patterns
Visualization
Statistical Analysis
- Hypothesis testing
- Attribute evaluation
- Comparing learned models
- Computing Confidence Intervals
CRISP-DM (Cross Industry Standard Process for Data
Mining) alternative process model developed by a
consortium of several companies
All data mining methods use induction-based learing
The knowledge
gained at the end of
the process is given
as a model/data
generalization
Data Mining Tasks...

Predictive Tasks: predict the value of a particular attribute
(called target or dependent variable) based on the value of
other attributes (called explanatory or independent
variables)
Goal: learning a model that minimizes the error between the
predicted and the true values of the target variable

Classification → discrete target variables

Regression → continuous target variables
...Data Mining Tasks...
Examples of Classification tasks

Develop a profile of a “successfull” person

Predict customers that will respond to a marketing
compain
Examples of Regression tasks

Forecasting the future price of a stock
… Data Mining Tasks...

Descriptive tasks: discover patterns (correlations, clusters,
trends, trajectories, anomalies) summarizing the underlying
relationship in the data

Association Analysis: discovers (the most interesting)
patterns describing strongly associated features in the
data/relationships among variables

Cluster Analysis: discovers groups of closely related
facts/observations. Facts belonging to the same cluster
are more similar each other than observations
belonging other clusters
...Data Mining Tasks...
Examples of Association Analysis tasks

Market Basket Analysis

Discoverying interesting relationships among retail
products. To be used for:

Arrange shelf or catalog items

Identify potential cross-marketing strategies/cross-
selling opportunities
Examples of Cluster Analysis tasks

Automaticaly grouping documents/web pages with
respect to their main topic (e.g. sport, economy...)
… Data Mining Tasks

Anomaly Detection: identifies facts/observations
(Outlier/change/deviation detection) having
characteristics significantly different from the rest of the
data. A good anomaly detector has a high detection rate
and a low false alarm rate.
• Example: Determine if a credit card purchase is
fraudolent → Imbalance learning setting
Approaches:

Supervised: build models by using input attributes to predict
output attribute values

Unsupervised: build models/patterns without having any
output attributes
The KDD process
Input
Data
Data Preprocessing
and Transformation
Data Mining
Interpretation
and
Evaluation
Information/
Taking Action
Data fusion (multiple sources)
Data Cleaning (noise,missing val.)
Feature Selection
Dimentionality Reduction
Data Normalization
The most labourous and
time consuming step
Filtering Patterns
Visualization
Statistical Analysis
- Hypothesis testing
- Attribute evaluation
- Comparing learned models
- Computing Confidence Intervals
CRISP-DM (Cross Industry Standard Process for Data
Mining) alternative process model developed by a
consortium of several companies
All data mining methods use induction-based learing
The knowledge
gained at the end of
the process is given
as a model/ data
generalization
A closer look at the Evalaution step
Given

DM task (i.e. Classification, clustering etc.)

A particular problem for the chosen task
Several DM algorithms can be used to solve the problem
1) How to assess the performance of an algorithm?
2) How to compare the performance of different
algorithms solving the same problem?
Evaluating the Performance of an Algorithm
Assessing Algorithm Performances
Components for supervised learning [Roiger'03]
Test data missing in unsupervised setting
Instances
Attributes
Data
Training
Data
Test Data
Model
Builder
Supervised
Model
Evaluation
Parameters
Performance
Measure
(Task Dependent)
Examples of Performace Measures

Classification → Predictive Accuracy

Regression → Mean Squared Error (MSE)

Clustering → Cohesion Index

Association Analysis → Rule Confidence

….....
Supervised Setting: Building Training and Test Set
Necessary to predict performance bounds based with whatever
data (independent test set)

Split data into training and test set

The repeated and stratified k-fold cross-validation is
the most widly used technique

Leave-one-out or bootstrap used for small datasets

Make a model on the training set and evaluate it out on the
test set [Witten'11]

e.g. Compute predictive accuracy/error rate
K-Fold Cross-validation (CV)

First step: split data into k subsets of equal size

Second step: use each subset in turn for testing, the
remainder for training

Subsets often stratified → reduces variance

Error estimates averaged to yield the overall error
estimate

Even better: repeated stratified cross-validation

E.g. 10-fold cross-validation is repeated 15 times
and results are averaged → reduces the variance
Test set step 1 Test set step 2 …..........
Leave-One-Out cross-validation

Leave-One-Out:
a particular form of cross-validation:

Set number of folds to number of training instances

I.e., for n training instances, build classifier n times

The results of all n judgement are averaged for
determining the final error estimate

Makes best use of the data for training

Involves no random subsampling

There's no point in repeating it → the same result will be
obtained each time
The bootstrap

CV uses sampling without replacement

The same instance, once selected, cannot be selected
again for a particular training/test set

Bootstrap uses sampling with replacement

Sample a dataset of n instances n times with
replacement to form a new dataset

Use this new dataset as the training set

Use the remaining instances not occurting in the
training set for testing

Also called the 0.632 bootstrap → The training data
will contain approximately 63.2% of the total instances
Estimating error
with the bootstrap
The error estimate of the true error on the test data will be
very pessimistic

Trained on just ~63% of the instances

Therefore, combine it with the resubstitution error:

The resubstitution error (error on training data) gets less
weight than the error on the test data

Repeat the bootstrap procedure several times with different
replacement samples; average the results
Comparing Algorithms Performances
For Supervised Aproach
Comparing Algorithms Performance
Frequent question: which of two learning algorithms performs
better?
Note: this is domain dependent!
Obvious way: compare the error rates computed by the use of
k-fold CV estimates
Problem: variance in estimate on a single 10-fold CV
Variance can be reduced using repeated CV
However, we still don’t know whether the results are reliable
Significance tests

Significance tests tell how confident we can be that there
really is a difference between the two learning algorithms

Statistical hypothesis test exploited → used for testing a
statistical hypothesis

Null hypothesis: there is no significant (“real”)
difference (between the algorithms)

Alternative hypothesis: there is a difference

Measures how much evidence there is in favor of rejecting
the null hypothesis for a specified level of significance
– Compare two learning algorithms by comparing
e.g. the average error rate over several cross-
validations (see [Witten'11] for details)
DM methods and SW:
A closer Look
DM methods and SW: a closer look

Classical DM algorithms originally developed for
propositional representations

Some upgrades to (multi-)relational and graph
representations defined
Semantic Web: characterized by

Rich/expressive representations (RDFS, OWL)
– How to cope with them when applying DM algorithms?

Open world Assumpion (OWA)
– DM algorithms grounded on CWA
– Are metrics for classical DM tasks still applicable?
Exploiting DM methods in SW...

Approximate inductive instance retrieval

assess the class membership of the individuals in a KB
w.r.t. a query concept [Fanizzi'12]

(Hyerarchical) Type prediciton
– Assess the type of instances in RDF datasets [Melo'14]

Link Prediction

Given an individual and a role R, predict the other
individuals a is in R relation with [Minervini'14]
Regarded as a classification problem → (semi-)automatic
ontology population
...Exploiting DM mthods in SW...

Automatic concept drift and novelty detection [Fanizzi'09]

change of a concept towards a more general/specific
one w.r.t. the evidence provided by new annotated
individuals

Ex.: almost all Worker work for more than 10 hours
per days → HardWorker

isolated cluster may require to be defined through new
emerging concepts to be added to ontology

Ex.: subset of Worker employed in a company →
Employee

Ex.: subset of Worker working for several
companies → Free-lance
Regarded as a (conceptual) clustering problem
...Exploiting DM mthods in SW

Semi-automatic ontology enrichment [d'Amato'10,Völker'11,
Völker'15,d'Amato'16]

exploiting the evidence coming from the data →
discovering hidden knowledge patterns in the form of
relational association rules

new axioms may be suggested → existing ontologies
can be extended
Regarded as a pattern discovery problem
Intro to the Research Task
Associative Analysis:
the Pattern Discovery Task
Problem Definition:
Given a dataset find

all possible hidden pattern in the form of Association Rule (AR)

having support and confidence greater than a minimum
thresholds
Definition: An AR is an implication expression of the form X → Y
where X and Y are disjoint itemsets
An AR expresses a co-occurrence relationship between the items
in the antecedent and the concequence not a causality relationship
Basic Definitions
 An itemset is a finite set of assignments of the form {A1
= a1
, …,
Am
= am
} where Ai
are attributes of the dataset and ai
the
corresponding values

The support of an itemset is the number of istances/tuples in the
dataset containing it.
Similarily, support of a rule is s(X → Y ) = |(X  Y)|;

The confidence of a rule provides how frequently items in the
consequence appear in instances/tuples containing the
antencedent
c(X → Y ) = |(X  Y)| / |(X)| (seen as p(Y|X) )
Discoverying Association Rules: General Approach
Articulated in two main steps [Agrawal'93, Tan'06]:
1. Frequent Patterns Generation/Discovery (generally in the form
of itemsets) wrt a minimum frequency (support) threshold

Apriori algortihm → The most well known
algorithm

the most expensive computation;
2. Rule Generation

Extraction of all the high-confidence association
rules from the discovered frequent patterns.
Apriori Algortihm: Key Aspects

Uses a level-wise generate-and-test approach

Grounded on the non-monotonic property of the support of an
itemset

The support of an itemset never exceeds the support of its
subsets

Basic principle:

if an itemset is frequent → all its subsets must also be
frequent

If an itemset is infrequent → all its supersets must be
infrequent too

Allow to sensibly cut the search space
Apriori Algorithm in a Nutshell
Goal: Finding the frequent itemsets ↔ the sets of items that
satisfying the min support threshold
Iteratively find frequent itemsets with lenght from 1 to k (k-itemset)
Given a set Lk-1
of frequent (k-1)itemset, join Lk-1
with itself to obain
Lk
the candidate k-itemsets
Prune items in Lk
that are not frequent (Apriori principle)
If Lk
is not empty, generate the next candidate (k+1)itemset until
the frequent itemset is empty
Apriori Algorithm: Example...
Suppose having the transaction table
(Boolean values considered for simplicity)
Apply APRIORI algorithm
ID List of Items
T1 {I1,I2,I5}
T2 {I2,I4}
T3 {I2,I3}
T4 {I1,I2,I4}
T5 {I1,I3}
T6 {I2,I3}
T7 {I1,I3}
T8 {I1,I2,I3,I5}
T9 {I1,I2,I3}
...Apriori Algorithm: Example...
Itemset Sup.
Count
{I1} 6
{I2} 7
{I3} 6
{I4} 2
{I5} 2
Itemset Sup.
Count
{I1} 6
{I2} 7
{I3} 6
{I4} 2
{I5} 2
Min. Supp. 2
Pruning
L1
Itemset Sup.
Count
{I1,I2} 4
{I1,I3} 4
{I1,I4} 1
{I1,I5} 2
{I2,I3} 4
{I2,I4} 2
{I2,I5} 2
{I3,I4} 0
{I3,I5} 1
{I4,I5} 0
L2
Min. Supp. 2
Pruning
Join for
candidate
generation
Output After Pruning
...Apriori Algorithm: Example
Itemset Prune
Infrequent
{I1,I2,I3} No
{I1,I2,I5} No
{I1,I2,I4} Yes {I1,I4}
{I1,I3,I5} Yes {I3,I5}
{I2,I3,I4} Yes {I3,I4}
{I2,I3,I5} Yes {I3,I5}
{I2,I4,I5} Yes {I4,I5}Output After Pruning
L4
Min.
Supp. 2
Pruning
Join for
candidate
generation
Itemset Sup.
Count
{I1,I2} 4
{I1,I3} 4
{I1,I5} 2
{I2,I3} 4
{I2,I4} 2
{I2,I5} 2
Apply Apriori
principle
Itemset Sup.
Count
{I1,I2,I3} 2
{I1,I2,I5} 2
Join for
candidate
generation
L3
Output After Pruning
Itemset Prune
Infrequent
{I1,I2,I3,I5} Yes {I3,I5}
Empty
Set
STOP
Generating ARs
from frequent itemsets

For each frequent itemset “I”
– generate all non-empty subsets S of I

For every non empty subset S of I
– compute the rule r := “S → (I-S)”

If conf(r) > = min confidence
– then output r
Genrating ARs: Example...
Given:
L = { {I1}, {I2}, {I3}, {I4}, {I5}, {I1,I2}, {I1,I3}, {I1,I5}, {I2,I3}, {I2,I4},
{I2,I5}, {I1,I2,I3}, {I1,I2,I5} }.
Let us fix 70% for the Minimum confidence threshold

Take l = {I1,I2,I5}.

All nonempty subsets are {I1,I2}, {I1,I5}, {I2,I5}, {I1}, {I2}, {I5}.
The resulting ARs and their confidence are:

R1: I1 AND I2 →I5
Conf(R1) = supp{I1,I2,I5}/supp{I1,I2} = 2/4 = 50% REJECTED
...Generating ARs: Example...
Min. Conf. Threshold 70%; l = {I1,I2,I5}.

All nonempty subsets are {I1,I2}, {I1,I5}, {I2,I5}, {I1}, {I2}, {I5}.
The resulting ARs and their confidence are:

R2: I1 AND I5 →I2
Conf(R2) = supp{I1,I2,I5}/supp{I1,I5} = 2/2 = 100% RETURNED

R3: I2 AND I5 → I1
Conf(R3) = supp{I1,I2,I5}/supp{I2,I5} = 2/2 = 100% RETURNED

R4: I1 → I2 AND I5
Conf(R4) = sc{I1,I2,I5}/sc{I1} = 2/6 = 33% REJECTED
...Genrating ARs: Example
Min. Conf. Threshold 70%; l = {I1,I2,I5}.

All nonempty subsets: {I1,I2}, {I1,I5}, {I2,I5}, {I1}, {I2}, {I5}.
The resulting ARs and their confidence are:

R5: I2 → I1 AND I5
Conf(R5) = sc{I1,I2,I5}/sc{I2} = 2/7 = 29% REJECTED

R6: I5 → I1 AND I2
Conf(R6) = sc{I1,I2,I5}/ {I5} = 2/2 = 100% RETURNED
Similarily for the other sets I in L (Note: it does not make sense to
consider an itemset made by just one element i.e. {I1} )
Identifying Representative Itemsets
When the number of discovered frequent itemsets is very high, it
could be useful to identify a representative set of itemsets from
which all other patters may be derived

Maximal Frequent Itemset: is a frequent itemset for which none
of its immediate supersets are frequent

Closed Frequent Itemset: is a frequent itemset for which none
of its immediate supersets has exactly its same support count
→ used for removing redundant rules
On improving Discovery of ARs
Apriori algorithm may degrade significantly for dense datasets
Alternative solutions:

FP-growth algorithm outperforms Apriori

Does not use the generate-and-test approach

Encodes the dataset in a compact data structure (FP-
Tree) and extract frequent itemsets directly from it

Usage of additional interenstingness metrics (besides support
and confidence) (see [Tan'06])

Lift, Interest Factor, correlation, IS Measure
Frequent Graph Patterns

Frequent Graph Patterns are subgraphs that are found from a
collection of graphs or a single massive graph, with a
frequency no less than a specifed support threshold
– Exploited for facilitating indexing and query processing

A graph g is a subgraph of another graph g' if there exists a
subgraph isomorphism from g to g' denoted by g  g'. g' is
called a supergraph of g
Discoverying Frequent Graph Patterns
Apriori-based and pattern-growth appraoches have been formally
defined
Problems:

Giving suitable definitions for support and conficence for
frequent subgraph mining problem

Even more complicate for the case of a single large graph
[see Aggarwal'10 sect. 2.5]

Developed appraoches revealed infeasible in practice
Graph Mining: Current Approaches
Methods for mining [Aggarwal'10, ch. 3, 4]:

Significant (optimal) subgraphs according to an objective
function

In a timely way by accessing only a small subset of
promising subgraphs

Representative (orthogonal) subgraphs by exploiting a notion
of similarity
Avoid generating the complete set of frequent subgraphs while
presenting only a set of interesting subgraph patterns
Pattern Discovery on RDF data sets for Making Predictions
Proposed frameworks for discoverying ARs from:
●
RDF data sets [Galarraga'13, Galarraga'15]
➢
Inspired to ILP appraoches for discovering ARs from clausal
representation
➢
Exploits discovered ARs for making new role preditions
➢
Takes into account the underlying OWA
➢
Proposes new metrics for evaluating the prediction results
considering the OWA
●
Populated Ontological knowldge bases [d'Amato'16]
●
Exploits the available background knowledge
●
Exploits deductive reasoning capabilities
●
Discovered ARs can make concept and role predictions
Research Task: Goal
Moving from [d'Amato'16], [Galarraga'15]
define a method for Discoverying ARs from ontological KBs that

Makes additional usage of the available ontological knowledge
and its underlying semantics (e.g. Using hierarchy of roles)

Takes advantage of solutions for improving the performances
of the method with respect to scalability. Possible directions:
– Heuristics for further cutting the search space
– Indexing methods for caching the results of the
inferences made by the reasoner
apply the formalized method to a knowledge graph generated as
output of the first part of the talk
Research Task: Possible Research Questions...
●
Can the formalized method be applied straightforwardly to a
knowledge graph generated as output of the first part of the
talk? Is there any gap that needs to be filled? If so, what is
such a gap?
●
Is OWA the right way to go? In case you move towards CWA,
what is the impact of such a choice in your method and its
evaluation?
●
Is the exploitation of a reasoner and a background knowledge a
value added or a bottleneck?
...Research Task: Possible Research Questions
●
Are the metrics proposed in the referenced papers enough? If
not:
➢
what are the aspects/effects/outputs that need to be
evalauted differently/further?
➢
what are the new/additional metrics that are necessary?
●
Is there any additional utility of the discovered rules
●
What is the chosen language for representing the discovered
rules? What is/are the motivation/s for it?
Research Task: Expected Output
4 Mins Presentation summarizing

[Galarraga'15/d'Amato'16] approach(es)
– only one group → 4 additional mins presentation
– Randomly decided when working on the research task

Proposed solution and its value added/advace with respect to
the references above

Replies to the proposed/new research questions, if any.

How do you plan to prove the value added of your proposal
Research Task: Groups Formation and Rules
●
Groups have to be composed of 7 members
●
Each group should has the following chracterstics
●
Members are different from the group for the mini-project
●
An overlap of maximum 2 members with respect to
the mini-project groups is allowed
●
Students that already know each other from preivous
experineces should belong to different groups
DON'T BE SHY
THERE ARE NO RIGHT OR WRONG ANSWERS
THIS IS THE TIME TO LEARN FROM INTERACTION
BE CREATIVE AND DON'T LIMIT YOURSELF
References...

[Fay'96] U. Fayyad, G. Piatetsky-Shapiro, P. Smyth. From
Data Mining to Knowledge Discovery: An Overview. Advances
in Knowledge Discovery and Data Mining, MIT Press, 1996.

[Agrawal'93] R. Agrawal, T. Imielinski, and A. N. Swami.
Mining association rules between sets of items in large
databases. Proc. of Int. Conf. on Management of Data, p.
207–216. ACM, 1993

[d'Amato'10] C. d'Amato, N. Fanizzi, F. Esposito: Inductive
learning for the Semantic Web: What does it buy? Semantic
Web 1(1-2): 53-59 (2010)

[Völker'11] J. Völker, M. Niepert: Statistical Schema Induction.
ESWC (1) 2011: 124-138
...References...
●
[Völker'15] J. Völker, D. Fleischhacker, H. Stuckenschmidt:
Automatic acquisition of class disjointness. J. Web Sem. 35: 124-
139 (2015)

[d'Amato'16] C. d'Amato, S. Staab, A.G.B. Tettamanzi, T. Minh,
F.L. Gandon. Ontology enrichment by discovering multi-relational
association rules from ontological knowledge bases. SAC 2016:
333-338

[Tan'06] P.N. Tan, M. Steinbach, V. Kumar. Introduction to Data
Mining. Ch. 6 Pearson, 2006 . http://www-
users.cs.umn.edu/~kumar/dmbook/ch6.pdf

[Aggarwal'10] C. Aggarwal, H. Wang. Managing and Mining
Graph Data. Springer, 2010
...References...
●
[Witten'11] I.H. Witten, E. Frank. Data Mining: Practical Machine
Learning Tool and Techiques with Java Implementations. Ch. 5.
Morgan-Kaufmann, 2011 (3rd
Edition)
●
[Fanizzi'12] N. Fanizzi, C. d'Amato, F. Esposito: Induction of
robust classifiers for web ontologies through kernel machines. J.
Web Sem. 11: 1-13 (2012)
●
[Minervini'14] P. Minervini, C. d'Amato, N. Fanizzi, F. Esposito:
Adaptive Knowledge Propagation in Web Ontologies. Proc. of
EKAW Conferece. Springer. pp. 304-319, 2014.

[Roiger'03] R.J. Roiger, M.W. Geatz. Data Mining. A Tutorial-
Based Primer. Addison Wesley, 2003
...References
●
[Melo'16] A. Melo, H. Paulheim, J. Völker. Type Prediction in RDF
Knowledge Bases Using Hierarchical Multilabel Classification.
WIMS 2016: 14
●
[Galárraga'13] L. Galárraga, C. Teflioudi, F. Suchanek, K. Hose.
AMIE: Association Rule Mining under Incomplete Evidence in
Ontological Knowledge Bases. Proc. of WWW 2013.
http://luisgalarraga.de/docs/amie.pdf

[Fanizzi'09] N. Fanizzi, C. d'Amato, F. Esposito: Metric-based
stochastic conceptual clustering for ontologies. Inf. Syst. 34(8):
792-806, 2009

[Galárraga'15] L. Galárraga, C. Teflioudi, F. Suchanek, K. Hose.
Fast Rule Mining in Ontological Knowledge Bases with AMIE+.
VLDB Journal 2015.
http://suchanek.name/work/publications/vldbj2015.pdf

Más contenido relacionado

La actualidad más candente

Survey on Various Classification Techniques in Data Mining
Survey on Various Classification Techniques in Data MiningSurvey on Various Classification Techniques in Data Mining
Survey on Various Classification Techniques in Data Miningijsrd.com
 
1.2 steps and functionalities
1.2 steps and functionalities1.2 steps and functionalities
1.2 steps and functionalitiesRajendran
 
Data preparation and processing chapter 2
Data preparation and processing chapter  2Data preparation and processing chapter  2
Data preparation and processing chapter 2Mahmoud Alfarra
 
Data mining an introduction
Data mining an introductionData mining an introduction
Data mining an introductionDr-Dipali Meher
 
Data Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trendData Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trendSalah Amean
 
Introduction to data mining
Introduction to data miningIntroduction to data mining
Introduction to data miningUjjawal
 
An efficient data preprocessing method for mining
An efficient data preprocessing method for miningAn efficient data preprocessing method for mining
An efficient data preprocessing method for miningKamesh Waran
 
Introduction to Datamining Concept and Techniques
Introduction to Datamining Concept and TechniquesIntroduction to Datamining Concept and Techniques
Introduction to Datamining Concept and TechniquesSơn Còm Nhom
 
Recommendation system using bloom filter in mapreduce
Recommendation system using bloom filter in mapreduceRecommendation system using bloom filter in mapreduce
Recommendation system using bloom filter in mapreduceIJDKP
 
knowledge discovery and data mining approach in databases (2)
knowledge discovery and data mining approach in databases (2)knowledge discovery and data mining approach in databases (2)
knowledge discovery and data mining approach in databases (2)Kartik Kalpande Patil
 
Data miningppt378
Data miningppt378Data miningppt378
Data miningppt378nitttin
 
Research trends in data warehousing and data mining
Research trends in data warehousing and data miningResearch trends in data warehousing and data mining
Research trends in data warehousing and data miningEr. Nawaraj Bhandari
 
Data Mining with SQL Server 2008
Data Mining with SQL Server 2008Data Mining with SQL Server 2008
Data Mining with SQL Server 2008Peter Gfader
 
Introduction-to-Knowledge Discovery in Database
Introduction-to-Knowledge Discovery in DatabaseIntroduction-to-Knowledge Discovery in Database
Introduction-to-Knowledge Discovery in DatabaseKartik Kalpande Patil
 
TTG Int.LTD Data Mining Technique
TTG Int.LTD Data Mining TechniqueTTG Int.LTD Data Mining Technique
TTG Int.LTD Data Mining TechniqueMehmet Beyaz
 
Application of data mining tools for
Application of data mining tools forApplication of data mining tools for
Application of data mining tools forIJDKP
 

La actualidad más candente (19)

Data Mining
Data MiningData Mining
Data Mining
 
Survey on Various Classification Techniques in Data Mining
Survey on Various Classification Techniques in Data MiningSurvey on Various Classification Techniques in Data Mining
Survey on Various Classification Techniques in Data Mining
 
1.2 steps and functionalities
1.2 steps and functionalities1.2 steps and functionalities
1.2 steps and functionalities
 
Data preparation and processing chapter 2
Data preparation and processing chapter  2Data preparation and processing chapter  2
Data preparation and processing chapter 2
 
Data mining an introduction
Data mining an introductionData mining an introduction
Data mining an introduction
 
Data Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trendData Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trend
 
Seminar Presentation
Seminar PresentationSeminar Presentation
Seminar Presentation
 
Introduction to data mining
Introduction to data miningIntroduction to data mining
Introduction to data mining
 
An efficient data preprocessing method for mining
An efficient data preprocessing method for miningAn efficient data preprocessing method for mining
An efficient data preprocessing method for mining
 
Introduction to Datamining Concept and Techniques
Introduction to Datamining Concept and TechniquesIntroduction to Datamining Concept and Techniques
Introduction to Datamining Concept and Techniques
 
Recommendation system using bloom filter in mapreduce
Recommendation system using bloom filter in mapreduceRecommendation system using bloom filter in mapreduce
Recommendation system using bloom filter in mapreduce
 
Ghhh
GhhhGhhh
Ghhh
 
knowledge discovery and data mining approach in databases (2)
knowledge discovery and data mining approach in databases (2)knowledge discovery and data mining approach in databases (2)
knowledge discovery and data mining approach in databases (2)
 
Data miningppt378
Data miningppt378Data miningppt378
Data miningppt378
 
Research trends in data warehousing and data mining
Research trends in data warehousing and data miningResearch trends in data warehousing and data mining
Research trends in data warehousing and data mining
 
Data Mining with SQL Server 2008
Data Mining with SQL Server 2008Data Mining with SQL Server 2008
Data Mining with SQL Server 2008
 
Introduction-to-Knowledge Discovery in Database
Introduction-to-Knowledge Discovery in DatabaseIntroduction-to-Knowledge Discovery in Database
Introduction-to-Knowledge Discovery in Database
 
TTG Int.LTD Data Mining Technique
TTG Int.LTD Data Mining TechniqueTTG Int.LTD Data Mining Technique
TTG Int.LTD Data Mining Technique
 
Application of data mining tools for
Application of data mining tools forApplication of data mining tools for
Application of data mining tools for
 

Destacado

Knowledge Patterns SSSW2016
Knowledge Patterns SSSW2016Knowledge Patterns SSSW2016
Knowledge Patterns SSSW2016Aldo Gangemi
 
Data Mining for Libraries
Data Mining for LibrariesData Mining for Libraries
Data Mining for LibrariesElaine Lasda
 
Demo: Profiling & Exploration of Linked Open Data
Demo: Profiling & Exploration of Linked Open DataDemo: Profiling & Exploration of Linked Open Data
Demo: Profiling & Exploration of Linked Open DataStefan Dietze
 
Search Engines After The Semanatic Web
Search Engines After The Semanatic WebSearch Engines After The Semanatic Web
Search Engines After The Semanatic Websamar_slideshare
 
SemTech 2011 Semantic Search tutorial
SemTech 2011 Semantic Search tutorialSemTech 2011 Semantic Search tutorial
SemTech 2011 Semantic Search tutorialPeter Mika
 
PhD Dissertation Supporting tools for automated generation and visual editing...
PhD Dissertation Supporting tools for automated generation and visual editing...PhD Dissertation Supporting tools for automated generation and visual editing...
PhD Dissertation Supporting tools for automated generation and visual editing...Álvaro Sicilia
 
Starship, Building Intelligent Delivery Robots
Starship, Building Intelligent Delivery RobotsStarship, Building Intelligent Delivery Robots
Starship, Building Intelligent Delivery RobotsAndré Karpištšenko
 
Knowledge discovery in social media mining for market analysis
Knowledge discovery in social media mining for market analysisKnowledge discovery in social media mining for market analysis
Knowledge discovery in social media mining for market analysisSenuri Wijenayake
 
In Search of a Semantic Book Search Engine: Are We There Yet?
In Search of a Semantic Book Search Engine: Are We There Yet?In Search of a Semantic Book Search Engine: Are We There Yet?
In Search of a Semantic Book Search Engine: Are We There Yet?Irfan Ullah
 
WOTS2E: A Search Engine for a Semantic Web of Things
WOTS2E: A Search Engine for a Semantic Web of ThingsWOTS2E: A Search Engine for a Semantic Web of Things
WOTS2E: A Search Engine for a Semantic Web of ThingsAndreas Kamilaris
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discoveryFraboni Ec
 
Semantic Search Over The Web
Semantic Search Over The WebSemantic Search Over The Web
Semantic Search Over The Webalierkan
 
Knowledge Discovery in Databases
Knowledge Discovery in DatabasesKnowledge Discovery in Databases
Knowledge Discovery in DatabasesDiwas Kandel
 
Knowledge Discovery and Data Mining
Knowledge Discovery and Data MiningKnowledge Discovery and Data Mining
Knowledge Discovery and Data MiningAmritanshu Mehra
 
Knowledge Discovery using an Integrated Semantic Web
Knowledge Discovery using an Integrated Semantic WebKnowledge Discovery using an Integrated Semantic Web
Knowledge Discovery using an Integrated Semantic WebMichel Dumontier
 

Destacado (20)

Knowledge Patterns SSSW2016
Knowledge Patterns SSSW2016Knowledge Patterns SSSW2016
Knowledge Patterns SSSW2016
 
Data Mining for Libraries
Data Mining for LibrariesData Mining for Libraries
Data Mining for Libraries
 
Demo: Profiling & Exploration of Linked Open Data
Demo: Profiling & Exploration of Linked Open DataDemo: Profiling & Exploration of Linked Open Data
Demo: Profiling & Exploration of Linked Open Data
 
Search Engines After The Semanatic Web
Search Engines After The Semanatic WebSearch Engines After The Semanatic Web
Search Engines After The Semanatic Web
 
A Survey of Entity Ranking over RDF Graphs
A Survey of Entity Ranking over RDF GraphsA Survey of Entity Ranking over RDF Graphs
A Survey of Entity Ranking over RDF Graphs
 
Knowledge Discovery in Production
Knowledge Discovery in ProductionKnowledge Discovery in Production
Knowledge Discovery in Production
 
SemTech 2011 Semantic Search tutorial
SemTech 2011 Semantic Search tutorialSemTech 2011 Semantic Search tutorial
SemTech 2011 Semantic Search tutorial
 
PhD Dissertation Supporting tools for automated generation and visual editing...
PhD Dissertation Supporting tools for automated generation and visual editing...PhD Dissertation Supporting tools for automated generation and visual editing...
PhD Dissertation Supporting tools for automated generation and visual editing...
 
School intro
School introSchool intro
School intro
 
Starship, Building Intelligent Delivery Robots
Starship, Building Intelligent Delivery RobotsStarship, Building Intelligent Delivery Robots
Starship, Building Intelligent Delivery Robots
 
Knowledge discovery in social media mining for market analysis
Knowledge discovery in social media mining for market analysisKnowledge discovery in social media mining for market analysis
Knowledge discovery in social media mining for market analysis
 
In Search of a Semantic Book Search Engine: Are We There Yet?
In Search of a Semantic Book Search Engine: Are We There Yet?In Search of a Semantic Book Search Engine: Are We There Yet?
In Search of a Semantic Book Search Engine: Are We There Yet?
 
WOTS2E: A Search Engine for a Semantic Web of Things
WOTS2E: A Search Engine for a Semantic Web of ThingsWOTS2E: A Search Engine for a Semantic Web of Things
WOTS2E: A Search Engine for a Semantic Web of Things
 
Data mining and knowledge discovery
Data mining and knowledge discoveryData mining and knowledge discovery
Data mining and knowledge discovery
 
Semantic Search Over The Web
Semantic Search Over The WebSemantic Search Over The Web
Semantic Search Over The Web
 
Knowledge Discovery in Databases
Knowledge Discovery in DatabasesKnowledge Discovery in Databases
Knowledge Discovery in Databases
 
Business-IT Alignment
Business-IT AlignmentBusiness-IT Alignment
Business-IT Alignment
 
Factors Influencing Knowledge Management
Factors Influencing Knowledge ManagementFactors Influencing Knowledge Management
Factors Influencing Knowledge Management
 
Knowledge Discovery and Data Mining
Knowledge Discovery and Data MiningKnowledge Discovery and Data Mining
Knowledge Discovery and Data Mining
 
Knowledge Discovery using an Integrated Semantic Web
Knowledge Discovery using an Integrated Semantic WebKnowledge Discovery using an Integrated Semantic Web
Knowledge Discovery using an Integrated Semantic Web
 

Similar a Tutorial Knowledge Discovery

Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationDr. Abdul Ahad Abro
 
Presentation Title
Presentation TitlePresentation Title
Presentation Titlebutest
 
Cssu dw dm
Cssu dw dmCssu dw dm
Cssu dw dmsumit621
 
Machine learning module 2
Machine learning module 2Machine learning module 2
Machine learning module 2Gokulks007
 
Data Reduction
Data ReductionData Reduction
Data ReductionRajan Shah
 
BI Chapter 04.pdf business business business business
BI Chapter 04.pdf business business business businessBI Chapter 04.pdf business business business business
BI Chapter 04.pdf business business business businessJawaherAlbaddawi
 
Pandas Data Cleaning and Preprocessing PPT.pptx
Pandas Data Cleaning and Preprocessing PPT.pptxPandas Data Cleaning and Preprocessing PPT.pptx
Pandas Data Cleaning and Preprocessing PPT.pptxbajajrishabh96tech
 
Lesson 1 - Overview of Machine Learning and Data Analysis.pptx
Lesson 1 - Overview of Machine Learning and Data Analysis.pptxLesson 1 - Overview of Machine Learning and Data Analysis.pptx
Lesson 1 - Overview of Machine Learning and Data Analysis.pptxcloudserviceuit
 
Lecture 09(introduction to machine learning)
Lecture 09(introduction to machine learning)Lecture 09(introduction to machine learning)
Lecture 09(introduction to machine learning)Jeet Das
 
Barga Data Science lecture 10
Barga Data Science lecture 10Barga Data Science lecture 10
Barga Data Science lecture 10Roger Barga
 
Data Mining methodology
 Data Mining methodology  Data Mining methodology
Data Mining methodology rebeccatho
 
Introduction
IntroductionIntroduction
Introductionbutest
 
Introduction
IntroductionIntroduction
Introductionbutest
 
Introduction
IntroductionIntroduction
Introductionbutest
 

Similar a Tutorial Knowledge Discovery (20)

Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, Classification
 
Presentation Title
Presentation TitlePresentation Title
Presentation Title
 
Part1
Part1Part1
Part1
 
Cssu dw dm
Cssu dw dmCssu dw dm
Cssu dw dm
 
Machine learning module 2
Machine learning module 2Machine learning module 2
Machine learning module 2
 
Data Reduction
Data ReductionData Reduction
Data Reduction
 
Data .pptx
Data .pptxData .pptx
Data .pptx
 
Chapter 1: Introduction to Data Mining
Chapter 1: Introduction to Data MiningChapter 1: Introduction to Data Mining
Chapter 1: Introduction to Data Mining
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data Mining
 
Talk
TalkTalk
Talk
 
BI Chapter 04.pdf business business business business
BI Chapter 04.pdf business business business businessBI Chapter 04.pdf business business business business
BI Chapter 04.pdf business business business business
 
Experimenting with Data!
Experimenting with Data!Experimenting with Data!
Experimenting with Data!
 
Pandas Data Cleaning and Preprocessing PPT.pptx
Pandas Data Cleaning and Preprocessing PPT.pptxPandas Data Cleaning and Preprocessing PPT.pptx
Pandas Data Cleaning and Preprocessing PPT.pptx
 
Lesson 1 - Overview of Machine Learning and Data Analysis.pptx
Lesson 1 - Overview of Machine Learning and Data Analysis.pptxLesson 1 - Overview of Machine Learning and Data Analysis.pptx
Lesson 1 - Overview of Machine Learning and Data Analysis.pptx
 
Lecture 09(introduction to machine learning)
Lecture 09(introduction to machine learning)Lecture 09(introduction to machine learning)
Lecture 09(introduction to machine learning)
 
Barga Data Science lecture 10
Barga Data Science lecture 10Barga Data Science lecture 10
Barga Data Science lecture 10
 
Data Mining methodology
 Data Mining methodology  Data Mining methodology
Data Mining methodology
 
Introduction
IntroductionIntroduction
Introduction
 
Introduction
IntroductionIntroduction
Introduction
 
Introduction
IntroductionIntroduction
Introduction
 

Último

HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxEsquimalt MFRC
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...Amil baba
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Pooja Bhuva
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxannathomasp01
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfNirmal Dwivedi
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxUmeshTimilsina1
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxmarlenawright1
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSCeline George
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...Nguyen Thanh Tu Collection
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxPooja Bhuva
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxPooja Bhuva
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsKarakKing
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxDr. Ravikiran H M Gowda
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024Elizabeth Walsh
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxRamakrishna Reddy Bijjam
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxJisc
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptxMaritesTamaniVerdade
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxCeline George
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash
 

Último (20)

HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptx
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 

Tutorial Knowledge Discovery

  • 1. Knowledge Discovery for the Semantic Web under the Data Mining Perspective Claudia d'Amato Department of Computer Science University of Bari Italy
  • 2. Knowledge Disovery: Definition Knowledge Discovery (KD) “the process of automatically searching large volumes of data for patterns that can be considered knowledge about the data” [Fay'96] Patterns need to be:  New – Hidden in the data  Useful  Understandable
  • 3. What is a Pattern and Knowldge? Pattern expression E in a given language L describing a subset FE of facts F. E is called pattern if it is simpler than enumerating facts in FE Knowledge awareness or understanding of facts, information, descriptions, or skills, which is acquired through experience or education by perceiving, discovering, or learning
  • 4. Knowledge Discovery and Data Minig  KD is often related with Data Mining (DM) field  DM is one step of the "Knowledge Discovery in Databases" process (KDD)[Fay'96]  DM is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and databases.  DM goal: extracting information from a data set and transforming it into an understandable structure/representation for further use
  • 5. What is not DM Not all information discovery tasks are considered to be DM.  Looking up individual records using a DBMS  Finding particular Web pages via a query to a search engine Tasks releted to the Information Retrieval (IR) Nonetheless, DM techniques can been used to improve IR systems  e.g. to create index structures for efficiently organizing and retrieving information
  • 6. The KDD process Input Data Data Preprocessing and Transformation Data Mining Interpretation and Evaluation Information/ Taking Action Data fusion (multiple sources) Data Cleaning (noise,missing val.) Feature Selection Dimentionality Reduction Data Normalization The most labourous and time consuming step Filtering Patterns Visualization Statistical Analysis - Hypothesis testing - Attribute evaluation - Comparing learned models - Computing Confidence Intervals CRISP-DM (Cross Industry Standard Process for Data Mining) alternative process model developed by a consortium of several companies All data mining methods use induction-based learing The knowledge gained at the end of the process is given as a model/data generalization
  • 7. The KDD process Input Data Data Preprocessing and Transformation Data Mining Interpretation and Evaluation Information/ Taking Action Data fusion (multiple sources) Data Cleaning (noise,missing val.) Feature Selection Dimentionality Reduction Data Normalization The most labourous and time consuming step Filtering Patterns Visualization Statistical Analysis - Hypothesis testing - Attribute evaluation - Comparing learned models - Computing Confidence Intervals CRISP-DM (Cross Industry Standard Process for Data Mining) alternative process model developed by a consortium of several companies All data mining methods use induction-based learing The knowledge gained at the end of the process is given as a model/data generalization
  • 8. Data Mining Tasks...  Predictive Tasks: predict the value of a particular attribute (called target or dependent variable) based on the value of other attributes (called explanatory or independent variables) Goal: learning a model that minimizes the error between the predicted and the true values of the target variable  Classification → discrete target variables  Regression → continuous target variables
  • 9. ...Data Mining Tasks... Examples of Classification tasks  Develop a profile of a “successfull” person  Predict customers that will respond to a marketing compain Examples of Regression tasks  Forecasting the future price of a stock
  • 10. … Data Mining Tasks...  Descriptive tasks: discover patterns (correlations, clusters, trends, trajectories, anomalies) summarizing the underlying relationship in the data  Association Analysis: discovers (the most interesting) patterns describing strongly associated features in the data/relationships among variables  Cluster Analysis: discovers groups of closely related facts/observations. Facts belonging to the same cluster are more similar each other than observations belonging other clusters
  • 11. ...Data Mining Tasks... Examples of Association Analysis tasks  Market Basket Analysis  Discoverying interesting relationships among retail products. To be used for:  Arrange shelf or catalog items  Identify potential cross-marketing strategies/cross- selling opportunities Examples of Cluster Analysis tasks  Automaticaly grouping documents/web pages with respect to their main topic (e.g. sport, economy...)
  • 12. … Data Mining Tasks  Anomaly Detection: identifies facts/observations (Outlier/change/deviation detection) having characteristics significantly different from the rest of the data. A good anomaly detector has a high detection rate and a low false alarm rate. • Example: Determine if a credit card purchase is fraudolent → Imbalance learning setting Approaches:  Supervised: build models by using input attributes to predict output attribute values  Unsupervised: build models/patterns without having any output attributes
  • 13. The KDD process Input Data Data Preprocessing and Transformation Data Mining Interpretation and Evaluation Information/ Taking Action Data fusion (multiple sources) Data Cleaning (noise,missing val.) Feature Selection Dimentionality Reduction Data Normalization The most labourous and time consuming step Filtering Patterns Visualization Statistical Analysis - Hypothesis testing - Attribute evaluation - Comparing learned models - Computing Confidence Intervals CRISP-DM (Cross Industry Standard Process for Data Mining) alternative process model developed by a consortium of several companies All data mining methods use induction-based learing The knowledge gained at the end of the process is given as a model/ data generalization
  • 14. A closer look at the Evalaution step Given  DM task (i.e. Classification, clustering etc.)  A particular problem for the chosen task Several DM algorithms can be used to solve the problem 1) How to assess the performance of an algorithm? 2) How to compare the performance of different algorithms solving the same problem?
  • 15. Evaluating the Performance of an Algorithm
  • 16. Assessing Algorithm Performances Components for supervised learning [Roiger'03] Test data missing in unsupervised setting Instances Attributes Data Training Data Test Data Model Builder Supervised Model Evaluation Parameters Performance Measure (Task Dependent) Examples of Performace Measures  Classification → Predictive Accuracy  Regression → Mean Squared Error (MSE)  Clustering → Cohesion Index  Association Analysis → Rule Confidence  ….....
  • 17. Supervised Setting: Building Training and Test Set Necessary to predict performance bounds based with whatever data (independent test set)  Split data into training and test set  The repeated and stratified k-fold cross-validation is the most widly used technique  Leave-one-out or bootstrap used for small datasets  Make a model on the training set and evaluate it out on the test set [Witten'11]  e.g. Compute predictive accuracy/error rate
  • 18. K-Fold Cross-validation (CV)  First step: split data into k subsets of equal size  Second step: use each subset in turn for testing, the remainder for training  Subsets often stratified → reduces variance  Error estimates averaged to yield the overall error estimate  Even better: repeated stratified cross-validation  E.g. 10-fold cross-validation is repeated 15 times and results are averaged → reduces the variance Test set step 1 Test set step 2 …..........
  • 19. Leave-One-Out cross-validation  Leave-One-Out: a particular form of cross-validation:  Set number of folds to number of training instances  I.e., for n training instances, build classifier n times  The results of all n judgement are averaged for determining the final error estimate  Makes best use of the data for training  Involves no random subsampling  There's no point in repeating it → the same result will be obtained each time
  • 20. The bootstrap  CV uses sampling without replacement  The same instance, once selected, cannot be selected again for a particular training/test set  Bootstrap uses sampling with replacement  Sample a dataset of n instances n times with replacement to form a new dataset  Use this new dataset as the training set  Use the remaining instances not occurting in the training set for testing  Also called the 0.632 bootstrap → The training data will contain approximately 63.2% of the total instances
  • 21. Estimating error with the bootstrap The error estimate of the true error on the test data will be very pessimistic  Trained on just ~63% of the instances  Therefore, combine it with the resubstitution error:  The resubstitution error (error on training data) gets less weight than the error on the test data  Repeat the bootstrap procedure several times with different replacement samples; average the results
  • 23. Comparing Algorithms Performance Frequent question: which of two learning algorithms performs better? Note: this is domain dependent! Obvious way: compare the error rates computed by the use of k-fold CV estimates Problem: variance in estimate on a single 10-fold CV Variance can be reduced using repeated CV However, we still don’t know whether the results are reliable
  • 24. Significance tests  Significance tests tell how confident we can be that there really is a difference between the two learning algorithms  Statistical hypothesis test exploited → used for testing a statistical hypothesis  Null hypothesis: there is no significant (“real”) difference (between the algorithms)  Alternative hypothesis: there is a difference  Measures how much evidence there is in favor of rejecting the null hypothesis for a specified level of significance – Compare two learning algorithms by comparing e.g. the average error rate over several cross- validations (see [Witten'11] for details)
  • 25. DM methods and SW: A closer Look
  • 26. DM methods and SW: a closer look  Classical DM algorithms originally developed for propositional representations  Some upgrades to (multi-)relational and graph representations defined Semantic Web: characterized by  Rich/expressive representations (RDFS, OWL) – How to cope with them when applying DM algorithms?  Open world Assumpion (OWA) – DM algorithms grounded on CWA – Are metrics for classical DM tasks still applicable?
  • 27. Exploiting DM methods in SW...  Approximate inductive instance retrieval  assess the class membership of the individuals in a KB w.r.t. a query concept [Fanizzi'12]  (Hyerarchical) Type prediciton – Assess the type of instances in RDF datasets [Melo'14]  Link Prediction  Given an individual and a role R, predict the other individuals a is in R relation with [Minervini'14] Regarded as a classification problem → (semi-)automatic ontology population
  • 28. ...Exploiting DM mthods in SW...  Automatic concept drift and novelty detection [Fanizzi'09]  change of a concept towards a more general/specific one w.r.t. the evidence provided by new annotated individuals  Ex.: almost all Worker work for more than 10 hours per days → HardWorker  isolated cluster may require to be defined through new emerging concepts to be added to ontology  Ex.: subset of Worker employed in a company → Employee  Ex.: subset of Worker working for several companies → Free-lance Regarded as a (conceptual) clustering problem
  • 29. ...Exploiting DM mthods in SW  Semi-automatic ontology enrichment [d'Amato'10,Völker'11, Völker'15,d'Amato'16]  exploiting the evidence coming from the data → discovering hidden knowledge patterns in the form of relational association rules  new axioms may be suggested → existing ontologies can be extended Regarded as a pattern discovery problem
  • 30. Intro to the Research Task
  • 31. Associative Analysis: the Pattern Discovery Task Problem Definition: Given a dataset find  all possible hidden pattern in the form of Association Rule (AR)  having support and confidence greater than a minimum thresholds Definition: An AR is an implication expression of the form X → Y where X and Y are disjoint itemsets An AR expresses a co-occurrence relationship between the items in the antecedent and the concequence not a causality relationship
  • 32. Basic Definitions  An itemset is a finite set of assignments of the form {A1 = a1 , …, Am = am } where Ai are attributes of the dataset and ai the corresponding values  The support of an itemset is the number of istances/tuples in the dataset containing it. Similarily, support of a rule is s(X → Y ) = |(X  Y)|;  The confidence of a rule provides how frequently items in the consequence appear in instances/tuples containing the antencedent c(X → Y ) = |(X  Y)| / |(X)| (seen as p(Y|X) )
  • 33. Discoverying Association Rules: General Approach Articulated in two main steps [Agrawal'93, Tan'06]: 1. Frequent Patterns Generation/Discovery (generally in the form of itemsets) wrt a minimum frequency (support) threshold  Apriori algortihm → The most well known algorithm  the most expensive computation; 2. Rule Generation  Extraction of all the high-confidence association rules from the discovered frequent patterns.
  • 34. Apriori Algortihm: Key Aspects  Uses a level-wise generate-and-test approach  Grounded on the non-monotonic property of the support of an itemset  The support of an itemset never exceeds the support of its subsets  Basic principle:  if an itemset is frequent → all its subsets must also be frequent  If an itemset is infrequent → all its supersets must be infrequent too  Allow to sensibly cut the search space
  • 35. Apriori Algorithm in a Nutshell Goal: Finding the frequent itemsets ↔ the sets of items that satisfying the min support threshold Iteratively find frequent itemsets with lenght from 1 to k (k-itemset) Given a set Lk-1 of frequent (k-1)itemset, join Lk-1 with itself to obain Lk the candidate k-itemsets Prune items in Lk that are not frequent (Apriori principle) If Lk is not empty, generate the next candidate (k+1)itemset until the frequent itemset is empty
  • 36. Apriori Algorithm: Example... Suppose having the transaction table (Boolean values considered for simplicity) Apply APRIORI algorithm ID List of Items T1 {I1,I2,I5} T2 {I2,I4} T3 {I2,I3} T4 {I1,I2,I4} T5 {I1,I3} T6 {I2,I3} T7 {I1,I3} T8 {I1,I2,I3,I5} T9 {I1,I2,I3}
  • 37. ...Apriori Algorithm: Example... Itemset Sup. Count {I1} 6 {I2} 7 {I3} 6 {I4} 2 {I5} 2 Itemset Sup. Count {I1} 6 {I2} 7 {I3} 6 {I4} 2 {I5} 2 Min. Supp. 2 Pruning L1 Itemset Sup. Count {I1,I2} 4 {I1,I3} 4 {I1,I4} 1 {I1,I5} 2 {I2,I3} 4 {I2,I4} 2 {I2,I5} 2 {I3,I4} 0 {I3,I5} 1 {I4,I5} 0 L2 Min. Supp. 2 Pruning Join for candidate generation Output After Pruning
  • 38. ...Apriori Algorithm: Example Itemset Prune Infrequent {I1,I2,I3} No {I1,I2,I5} No {I1,I2,I4} Yes {I1,I4} {I1,I3,I5} Yes {I3,I5} {I2,I3,I4} Yes {I3,I4} {I2,I3,I5} Yes {I3,I5} {I2,I4,I5} Yes {I4,I5}Output After Pruning L4 Min. Supp. 2 Pruning Join for candidate generation Itemset Sup. Count {I1,I2} 4 {I1,I3} 4 {I1,I5} 2 {I2,I3} 4 {I2,I4} 2 {I2,I5} 2 Apply Apriori principle Itemset Sup. Count {I1,I2,I3} 2 {I1,I2,I5} 2 Join for candidate generation L3 Output After Pruning Itemset Prune Infrequent {I1,I2,I3,I5} Yes {I3,I5} Empty Set STOP
  • 39. Generating ARs from frequent itemsets  For each frequent itemset “I” – generate all non-empty subsets S of I  For every non empty subset S of I – compute the rule r := “S → (I-S)”  If conf(r) > = min confidence – then output r
  • 40. Genrating ARs: Example... Given: L = { {I1}, {I2}, {I3}, {I4}, {I5}, {I1,I2}, {I1,I3}, {I1,I5}, {I2,I3}, {I2,I4}, {I2,I5}, {I1,I2,I3}, {I1,I2,I5} }. Let us fix 70% for the Minimum confidence threshold  Take l = {I1,I2,I5}.  All nonempty subsets are {I1,I2}, {I1,I5}, {I2,I5}, {I1}, {I2}, {I5}. The resulting ARs and their confidence are:  R1: I1 AND I2 →I5 Conf(R1) = supp{I1,I2,I5}/supp{I1,I2} = 2/4 = 50% REJECTED
  • 41. ...Generating ARs: Example... Min. Conf. Threshold 70%; l = {I1,I2,I5}.  All nonempty subsets are {I1,I2}, {I1,I5}, {I2,I5}, {I1}, {I2}, {I5}. The resulting ARs and their confidence are:  R2: I1 AND I5 →I2 Conf(R2) = supp{I1,I2,I5}/supp{I1,I5} = 2/2 = 100% RETURNED  R3: I2 AND I5 → I1 Conf(R3) = supp{I1,I2,I5}/supp{I2,I5} = 2/2 = 100% RETURNED  R4: I1 → I2 AND I5 Conf(R4) = sc{I1,I2,I5}/sc{I1} = 2/6 = 33% REJECTED
  • 42. ...Genrating ARs: Example Min. Conf. Threshold 70%; l = {I1,I2,I5}.  All nonempty subsets: {I1,I2}, {I1,I5}, {I2,I5}, {I1}, {I2}, {I5}. The resulting ARs and their confidence are:  R5: I2 → I1 AND I5 Conf(R5) = sc{I1,I2,I5}/sc{I2} = 2/7 = 29% REJECTED  R6: I5 → I1 AND I2 Conf(R6) = sc{I1,I2,I5}/ {I5} = 2/2 = 100% RETURNED Similarily for the other sets I in L (Note: it does not make sense to consider an itemset made by just one element i.e. {I1} )
  • 43. Identifying Representative Itemsets When the number of discovered frequent itemsets is very high, it could be useful to identify a representative set of itemsets from which all other patters may be derived  Maximal Frequent Itemset: is a frequent itemset for which none of its immediate supersets are frequent  Closed Frequent Itemset: is a frequent itemset for which none of its immediate supersets has exactly its same support count → used for removing redundant rules
  • 44. On improving Discovery of ARs Apriori algorithm may degrade significantly for dense datasets Alternative solutions:  FP-growth algorithm outperforms Apriori  Does not use the generate-and-test approach  Encodes the dataset in a compact data structure (FP- Tree) and extract frequent itemsets directly from it  Usage of additional interenstingness metrics (besides support and confidence) (see [Tan'06])  Lift, Interest Factor, correlation, IS Measure
  • 45. Frequent Graph Patterns  Frequent Graph Patterns are subgraphs that are found from a collection of graphs or a single massive graph, with a frequency no less than a specifed support threshold – Exploited for facilitating indexing and query processing  A graph g is a subgraph of another graph g' if there exists a subgraph isomorphism from g to g' denoted by g  g'. g' is called a supergraph of g
  • 46. Discoverying Frequent Graph Patterns Apriori-based and pattern-growth appraoches have been formally defined Problems:  Giving suitable definitions for support and conficence for frequent subgraph mining problem  Even more complicate for the case of a single large graph [see Aggarwal'10 sect. 2.5]  Developed appraoches revealed infeasible in practice
  • 47. Graph Mining: Current Approaches Methods for mining [Aggarwal'10, ch. 3, 4]:  Significant (optimal) subgraphs according to an objective function  In a timely way by accessing only a small subset of promising subgraphs  Representative (orthogonal) subgraphs by exploiting a notion of similarity Avoid generating the complete set of frequent subgraphs while presenting only a set of interesting subgraph patterns
  • 48. Pattern Discovery on RDF data sets for Making Predictions Proposed frameworks for discoverying ARs from: ● RDF data sets [Galarraga'13, Galarraga'15] ➢ Inspired to ILP appraoches for discovering ARs from clausal representation ➢ Exploits discovered ARs for making new role preditions ➢ Takes into account the underlying OWA ➢ Proposes new metrics for evaluating the prediction results considering the OWA ● Populated Ontological knowldge bases [d'Amato'16] ● Exploits the available background knowledge ● Exploits deductive reasoning capabilities ● Discovered ARs can make concept and role predictions
  • 49. Research Task: Goal Moving from [d'Amato'16], [Galarraga'15] define a method for Discoverying ARs from ontological KBs that  Makes additional usage of the available ontological knowledge and its underlying semantics (e.g. Using hierarchy of roles)  Takes advantage of solutions for improving the performances of the method with respect to scalability. Possible directions: – Heuristics for further cutting the search space – Indexing methods for caching the results of the inferences made by the reasoner apply the formalized method to a knowledge graph generated as output of the first part of the talk
  • 50. Research Task: Possible Research Questions... ● Can the formalized method be applied straightforwardly to a knowledge graph generated as output of the first part of the talk? Is there any gap that needs to be filled? If so, what is such a gap? ● Is OWA the right way to go? In case you move towards CWA, what is the impact of such a choice in your method and its evaluation? ● Is the exploitation of a reasoner and a background knowledge a value added or a bottleneck?
  • 51. ...Research Task: Possible Research Questions ● Are the metrics proposed in the referenced papers enough? If not: ➢ what are the aspects/effects/outputs that need to be evalauted differently/further? ➢ what are the new/additional metrics that are necessary? ● Is there any additional utility of the discovered rules ● What is the chosen language for representing the discovered rules? What is/are the motivation/s for it?
  • 52. Research Task: Expected Output 4 Mins Presentation summarizing  [Galarraga'15/d'Amato'16] approach(es) – only one group → 4 additional mins presentation – Randomly decided when working on the research task  Proposed solution and its value added/advace with respect to the references above  Replies to the proposed/new research questions, if any.  How do you plan to prove the value added of your proposal
  • 53. Research Task: Groups Formation and Rules ● Groups have to be composed of 7 members ● Each group should has the following chracterstics ● Members are different from the group for the mini-project ● An overlap of maximum 2 members with respect to the mini-project groups is allowed ● Students that already know each other from preivous experineces should belong to different groups DON'T BE SHY THERE ARE NO RIGHT OR WRONG ANSWERS THIS IS THE TIME TO LEARN FROM INTERACTION BE CREATIVE AND DON'T LIMIT YOURSELF
  • 54. References...  [Fay'96] U. Fayyad, G. Piatetsky-Shapiro, P. Smyth. From Data Mining to Knowledge Discovery: An Overview. Advances in Knowledge Discovery and Data Mining, MIT Press, 1996.  [Agrawal'93] R. Agrawal, T. Imielinski, and A. N. Swami. Mining association rules between sets of items in large databases. Proc. of Int. Conf. on Management of Data, p. 207–216. ACM, 1993  [d'Amato'10] C. d'Amato, N. Fanizzi, F. Esposito: Inductive learning for the Semantic Web: What does it buy? Semantic Web 1(1-2): 53-59 (2010)  [Völker'11] J. Völker, M. Niepert: Statistical Schema Induction. ESWC (1) 2011: 124-138
  • 55. ...References... ● [Völker'15] J. Völker, D. Fleischhacker, H. Stuckenschmidt: Automatic acquisition of class disjointness. J. Web Sem. 35: 124- 139 (2015)  [d'Amato'16] C. d'Amato, S. Staab, A.G.B. Tettamanzi, T. Minh, F.L. Gandon. Ontology enrichment by discovering multi-relational association rules from ontological knowledge bases. SAC 2016: 333-338  [Tan'06] P.N. Tan, M. Steinbach, V. Kumar. Introduction to Data Mining. Ch. 6 Pearson, 2006 . http://www- users.cs.umn.edu/~kumar/dmbook/ch6.pdf  [Aggarwal'10] C. Aggarwal, H. Wang. Managing and Mining Graph Data. Springer, 2010
  • 56. ...References... ● [Witten'11] I.H. Witten, E. Frank. Data Mining: Practical Machine Learning Tool and Techiques with Java Implementations. Ch. 5. Morgan-Kaufmann, 2011 (3rd Edition) ● [Fanizzi'12] N. Fanizzi, C. d'Amato, F. Esposito: Induction of robust classifiers for web ontologies through kernel machines. J. Web Sem. 11: 1-13 (2012) ● [Minervini'14] P. Minervini, C. d'Amato, N. Fanizzi, F. Esposito: Adaptive Knowledge Propagation in Web Ontologies. Proc. of EKAW Conferece. Springer. pp. 304-319, 2014.  [Roiger'03] R.J. Roiger, M.W. Geatz. Data Mining. A Tutorial- Based Primer. Addison Wesley, 2003
  • 57. ...References ● [Melo'16] A. Melo, H. Paulheim, J. Völker. Type Prediction in RDF Knowledge Bases Using Hierarchical Multilabel Classification. WIMS 2016: 14 ● [Galárraga'13] L. Galárraga, C. Teflioudi, F. Suchanek, K. Hose. AMIE: Association Rule Mining under Incomplete Evidence in Ontological Knowledge Bases. Proc. of WWW 2013. http://luisgalarraga.de/docs/amie.pdf  [Fanizzi'09] N. Fanizzi, C. d'Amato, F. Esposito: Metric-based stochastic conceptual clustering for ontologies. Inf. Syst. 34(8): 792-806, 2009  [Galárraga'15] L. Galárraga, C. Teflioudi, F. Suchanek, K. Hose. Fast Rule Mining in Ontological Knowledge Bases with AMIE+. VLDB Journal 2015. http://suchanek.name/work/publications/vldbj2015.pdf