IAIK
Semantic Pattern Transformation
IKNOW 2013
Peter Teufl, Herbert Leitold, Reinhard Posch
peter.teufl@iaik.tugraz.at
Our Background
Topics
Mobile device security
Cloud security
Security consulting for public institutions (Austria)
IT security research
IT security lectures
e-Government
A-SIT
Why does he talk about Knowledge Discovery?
How does IT security relate to knowledge discovery?
eGov - eParticipation: document analysis, twitter etc.
intrusion detection systems (network traffic analysis)
malware detection (network traffic, mobile phones)
mobile application analysis (metadata, market descriptions)
mobile application security (hot topic, BYOD, etc.)
What to expect?
Motivation for the Semantic Pattern Transformation
Basic concepts, techniques
How does it work? Evaluation?
Applications, results, current topics!
Environment
Arbitrary features
No a priori knowledge
Heterogeneous domains
Clustering
Supervised learning
Anomaly Detection
Semantic search
Visualization
Extracting knowledge
Text analysis
Android market descriptions
histograms
flexible deployment
new domains
terms
numbers
Process...
• Different processing steps
• From defining the goals
• To extracting the desired knowledge
• Machine learning algorithms are often used within KDD
• However, the complete machine learning process is quite similar to KDD
[Figure: two process models side by side.
KDT (Fayyad et al.): knowledge discovery goals, target data set, preprocessing, data extraction, data mining method, data mining algorithm, knowledge extraction, data mining, knowledge processing.
ML-KDT (machine learning): domain-specific data set, machine learning goals, instance extraction, feature selection and construction, instance selection, machine learning algorithm, preprocessing, algorithm application, interpretation.]
ADAPTATION COMPLEXITY?
• Assuming an arbitrary data set (e-Participation, Android Market applications)
• Further assuming a knowledge discovery goal, e.g., unsupervised clustering
• Then: we need to adapt the steps on the left
• And: we need to adapt this setup when the data changes, even when the knowledge discovery goals remain the same!
• Android Market applications vs. text documents vs. network traffic vs. malware detection?
[Figure: the machine learning process (domain-specific data set, machine learning goals, instance extraction, feature selection and construction, instance selection, algorithm selection, preprocessing, algorithm application, interpretation), with each step rated by its dependence on domain data and goals: high, medium, low]
TOWARDS A SEMANTIC REPRESENTATION
• Finding a new representation...
• The new representation is called Semantic Patterns
• Key properties:
  • Still a vector representation (compatible with the old representation)
  • Not the feature values themselves, but their semantic relations are represented
  • All values have the same meaning and feature type (activation)
• Transformation from raw data into Semantic Patterns: Semantic Pattern Transformation
SEMANTIC PATTERN TRANSFORMATION
• The Semantic Pattern Transformation is arranged in five layers
• Layer 1 - Feature extraction
• Layer 2 - Associative network - Node generation
• Layer 3 - Associative network - Link generation
• Layer 4 - Spreading activation (SA)
• Layer 5 - Analysis (machine learning, semantic search etc.)
[Figure: overview of the layers. Layer 1 (Feature Extraction): a data set of instances with features SF 1, SF 2, DF 1, DF 2 and a FROM/TO/TIME relation. Layers 2-3 (Associative Network Generation): feature values are mapped to network nodes. Layer 4 (Spreading Activation): patterns P 1 to P 4. Layer 5 (Analysis): supervised learning, unsupervised clustering, semantic relations, feature value relevance, anomaly detection, semantic development over time, pattern similarity.]
SPT: Layer 1 - Feature extraction
Extract features, their values and determine the type (categorical, distance-based)
Categorical: Exports
Distance-based: Unemployment rate, fertility rate
Country  Exports               Unemployment rate  Fertility rate
C1       coffee                20%                5
C2       cacao                 20%                5
C3       coffee, cacao         20%                5
C4       machinery             5%                 2
C5       chemicals             5%                 2
C6       chemicals, machinery  5%                 2
C7       chemicals, cacao      20%                missing data
C8       missing data          20%                5
C9       coffee, cacao         missing data       missing data
SPT: Layer 2 - Node generation
[Figure: associative network with one node per feature value of the country table: coffee, cacao, machinery, chemicals, 20%, 5%, 5, 2]
Categorical feature values: one node for each value
Distance-based feature values: map value ranges to single nodes
Together, the nodes form the associative network
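The node generation rule above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the node naming scheme, the `range_node` helper and the fixed bin widths (5 for the unemployment rate, 1 for the fertility rate) are assumptions chosen so that the toy table yields eight nodes.

```python
# Toy data from the slide table; None marks "missing data".
COUNTRIES = {
    "C1": (["coffee"], 20, 5),
    "C2": (["cacao"], 20, 5),
    "C3": (["coffee", "cacao"], 20, 5),
    "C4": (["machinery"], 5, 2),
    "C5": (["chemicals"], 5, 2),
    "C6": (["chemicals", "machinery"], 5, 2),
    "C7": (["chemicals", "cacao"], 20, None),
    "C8": (None, 20, 5),
    "C9": (["coffee", "cacao"], None, None),
}


def range_node(feature, value, width):
    """Map a numeric value to the node of its value range (assumed fixed-width bins)."""
    low = (value // width) * width
    return f"{feature}:[{low},{low + width})"


def instance_nodes(exports, unemployment, fertility):
    """Return the nodes one instance touches; missing values contribute no node."""
    nodes = set()
    for export in exports or []:
        nodes.add(f"exports:{export}")  # categorical: one node per value
    if unemployment is not None:
        nodes.add(range_node("unemployment", unemployment, 5))
    if fertility is not None:
        nodes.add(range_node("fertility", fertility, 1))
    return nodes


all_nodes = set()
for row in COUNTRIES.values():
    all_nodes |= instance_nodes(*row)

for node in sorted(all_nodes):
    print(node)
```

Missing data simply produces no node for that feature, which is why C8 and C9 can still take part in the later layers.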
SPT: Layer 3 - Link generation
[Figure: associative network with weighted links between the nodes coffee, cacao, machinery, chemicals, 20%, 5%, 5 and 2; link weight legend: 0.25, 0.5, 0.75, 1.00]
Example instances: C1 (coffee, 20%, 5) and C7 (chemicals, cacao, 20%)
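The slide does not state how the link weights are computed. One plausible reading, sketched here as an assumption, is to count how often two feature values co-occur in an instance and to scale the counts so that the most frequent pair gets weight 1.0. On the toy table this yields exactly the four weight levels of the legend (0.25, 0.5, 0.75, 1.00). The names `link_weights` and `weight` are illustrative.

```python
from collections import Counter
from itertools import combinations

# Feature values per country from the slide table (missing data omitted).
INSTANCES = {
    "C1": ["coffee", "20%", "5"],
    "C2": ["cacao", "20%", "5"],
    "C3": ["coffee", "cacao", "20%", "5"],
    "C4": ["machinery", "5%", "2"],
    "C5": ["chemicals", "5%", "2"],
    "C6": ["chemicals", "machinery", "5%", "2"],
    "C7": ["chemicals", "cacao", "20%"],
    "C8": ["20%", "5"],
    "C9": ["coffee", "cacao"],
}


def link_weights(instances):
    """Count co-occurring node pairs and scale by the largest count (assumed rule)."""
    counts = Counter()
    for nodes in instances.values():
        for pair in combinations(sorted(set(nodes)), 2):
            counts[pair] += 1
    strongest = max(counts.values())
    return {pair: n / strongest for pair, n in counts.items()}


WEIGHTS = link_weights(INSTANCES)


def weight(a, b):
    """Symmetric lookup; unlinked nodes have weight 0.0."""
    return WEIGHTS.get(tuple(sorted((a, b))), 0.0)


print(weight("20%", "5"))           # co-occur in C1, C2, C3, C8
print(weight("chemicals", "cacao")) # co-occur only in C7
```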
SPT: Layer 4 - Spreading activation
Creating a Semantic Pattern: in this case for “coffee” and “cacao”
Set activation value of the two nodes to 1.0
Spread this activation value to neighboring nodes via the weighted links
[Figure: associative network in which the nodes coffee and cacao are set to an activation value of 1.0]
SPT: Layer 4 - Spreading activation
Typically, one would create Semantic Patterns for all instances within the data set
E.g. a pattern for C1 by activating coffee, 20% and 5
However, we can also create patterns for feature values: e.g. “coffee”
SPT: Layer 4 - Spreading activation
After SA: each node in the network has an activation value
By representing the nodes and their activation values as a vector, we gain a Semantic Pattern (here for “coffee” and “cacao”):
coffee  cacao  machinery  chemicals  20%   5%    5     2
1.15    1.15   0.00       0.08       0.38  0.00  0.30  0.00
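The slides do not spell out the spreading rule, so the following is a sketch under two assumptions: link weights are co-occurrence counts scaled by the largest count, and activation is spread for a single step with a factor of 0.3. Both choices are inferred, not taken from the authors; they are used here because they reproduce the pattern values shown for “coffee” and “cacao” up to rounding (0.375 vs. 0.38, 0.075 vs. 0.08).

```python
from collections import Counter
from itertools import combinations

NODES = ["coffee", "cacao", "machinery", "chemicals", "20%", "5%", "5", "2"]

# Feature values per country from the slide table (missing data omitted).
INSTANCES = [
    ["coffee", "20%", "5"],                 # C1
    ["cacao", "20%", "5"],                  # C2
    ["coffee", "cacao", "20%", "5"],        # C3
    ["machinery", "5%", "2"],               # C4
    ["chemicals", "5%", "2"],               # C5
    ["chemicals", "machinery", "5%", "2"],  # C6
    ["chemicals", "cacao", "20%"],          # C7
    ["20%", "5"],                           # C8
    ["coffee", "cacao"],                    # C9
]


def link_weights(instances):
    """Co-occurrence counts scaled so the most frequent pair has weight 1.0 (assumed)."""
    counts = Counter()
    for nodes in instances:
        for pair in combinations(sorted(nodes), 2):
            counts[pair] += 1
    top = max(counts.values())
    return {pair: n / top for pair, n in counts.items()}


def spread_activation(active, weights, factor=0.3):
    """One step: active nodes start at 1.0 and send factor * weight to each neighbor."""
    pattern = {}
    for node in NODES:
        received = sum(
            weights.get(tuple(sorted((node, source))), 0.0)
            for source in active
            if source != node
        )
        pattern[node] = (1.0 if node in active else 0.0) + factor * received
    return pattern


WEIGHTS = link_weights(INSTANCES)
PATTERN = spread_activation({"coffee", "cacao"}, WEIGHTS)
for node in NODES:
    print(f"{node:<10} {PATTERN[node]:.3f}")
```

Activating coffee, 20% and 5 instead gives the pattern of instance C1; activating a single node such as coffee gives the pattern of that feature value.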
[Figure: three unsorted Semantic Patterns, one each for the feature values Export: Cacao, Export: Coffee and Fertility: 2; x-axis: coffee, cacao, machinery, chemicals, 20%, 5%, 5, 2; y-axis: activation from 0 to 0.50]
Each feature value is represented by a semantic fingerprint
Allows for an instant analysis of semantic relations to other feature values
Operations on patterns: sort, mean, variance, adding, subtracting
SPT: Layer 5 - Analysis
Calculating the distance between two patterns (Euclidean distance, cosine similarity)
For unsupervised clustering, semantic-aware search algorithms
Keyword search for coffee:
C1 coffee 20% 5
C3 coffee, cacao 20% 5
C9 coffee, cacao missing data missing data
Semantic-aware search for coffee:
C9 coffee, cacao missing data missing data
C1 coffee 20% 5
C3 coffee, cacao 20% 5
C2 cacao 20% 5
C8 missing data 20% 5
C7 chemicals, cacao 20% missing data
C5 chemicals 5% 2
C6 chemicals, machinery 5% 2
C4 machinery 5% 2
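A semantic-aware search of this kind can be sketched as a cosine-similarity ranking of the instance patterns against a query pattern. The instance patterns below are copied from the Semantic Patterns table on the "Comparing the two models" slide. The query pattern for “coffee” is an assumption: it is what a one-step spread with factor 0.3 over max-normalized co-occurrence weights would give, since the slides show it only as a plot.

```python
import math

# Node order: coffee, cacao, machinery, chemicals, 20%, 5%, 5, 2
PATTERNS = {
    "C1": [1.30, 0.53, 0.00, 0.08, 1.45, 0.00, 1.45, 0.00],
    "C2": [0.45, 1.38, 0.00, 0.15, 1.53, 0.00, 1.45, 0.00],
    "C3": [1.45, 1.53, 0.00, 0.15, 1.68, 0.00, 1.60, 0.00],
    "C4": [0.00, 0.00, 1.30, 0.38, 0.00, 1.38, 0.00, 1.38],
    "C5": [0.00, 0.08, 0.38, 1.30, 0.08, 1.38, 0.00, 1.38],
    "C6": [0.00, 0.08, 1.37, 1.37, 0.08, 1.53, 0.00, 1.53],
    "C7": [0.30, 1.30, 0.08, 1.15, 1.30, 0.15, 0.45, 0.15],
    "C8": [0.30, 0.38, 0.00, 0.08, 1.30, 0.00, 1.30, 0.00],
    "C9": [1.15, 1.15, 0.00, 0.08, 0.38, 0.00, 0.30, 0.00],
}

# Assumed pattern for the feature value "coffee" (not given numerically in the slides).
QUERY = [1.00, 0.15, 0.00, 0.00, 0.15, 0.00, 0.15, 0.00]


def cosine(a, b):
    """Cosine similarity of two equally long vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def euclidean(a, b):
    """Euclidean distance, the other measure named on the slide."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


ranking = sorted(PATTERNS, key=lambda c: cosine(QUERY, PATTERNS[c]), reverse=True)
for country in ranking:
    print(country, round(cosine(QUERY, PATTERNS[country]), 3))
```

Under these assumptions the cosine ranking comes out in the same order as the semantic-aware result list on the slide, with C2 and C8 ranked above the chemical and machinery exporters even though neither mentions coffee.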
SPT: Layer 5 - Analysis
Machine learning: apply any machine learning algorithm to the Semantic Patterns
Unsupervised clustering
Supervised learning
Semantic-aware search
Knowledge discovery: semantic relations, arbitrary procedures: mean, variance etc.
Anomaly detection, feature relevance, simple operations (variance, mean, etc.)
Visualization
Benefits?
[Figure: the machine learning process (domain-specific data set, machine learning goals, instance extraction, feature selection and construction, instance selection, algorithm selection, preprocessing, algorithm application, interpretation) shown twice, with each step rated by its dependence on domain data and goals: high, medium, low]
Application in heterogeneous domains regardless of the nature of the data
Except for Layer 1, we do not need any manual setup for the layers
Regardless of the analyzed data, the Semantic Patterns always use the same model
This means: regardless of the deployed knowledge discovery method, we can always use the same methods for knowledge extraction!
Comparing the two models
Country Coffee Cacao Machinery Chemicals 20% 5% 5 2
C1 1.30 0.53 0.00 0.08 1.45 0.00 1.45 0.00
C2 0.45 1.38 0.00 0.15 1.53 0.00 1.45 0.00
C3 1.45 1.53 0.00 0.15 1.68 0.00 1.60 0.00
C4 0.00 0.00 1.30 0.38 0.00 1.38 0.00 1.38
C5 0.00 0.08 0.38 1.30 0.08 1.38 0.00 1.38
C6 0.00 0.08 1.37 1.37 0.08 1.53 0.00 1.53
C7 0.30 1.30 0.08 1.15 1.30 0.15 0.45 0.15
C8 0.30 0.38 0.00 0.08 1.30 0.00 1.30 0.00
C9 1.15 1.15 0.00 0.08 0.38 0.00 0.30 0.00
[Figure: two unsorted Semantic Patterns over coffee, cacao, machinery, chemicals, 20%, 5%, 5, 2: the mean pattern of C4, C5, C6 (y-axis 0 to 1.50) and the mean pattern of C1, C2, C3 (y-axis 0 to 2.00)]
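Because every entry of a Semantic Pattern is an activation value, a mean pattern is just the element-wise mean of the instance patterns. The sketch below computes the two mean patterns named on this slide from the table values above; the helper names `mean_pattern` and `top_nodes` are illustrative and not part of any published API.

```python
NODES = ["coffee", "cacao", "machinery", "chemicals", "20%", "5%", "5", "2"]

# Instance patterns copied from the Semantic Patterns table on this slide.
PATTERNS = {
    "C1": [1.30, 0.53, 0.00, 0.08, 1.45, 0.00, 1.45, 0.00],
    "C2": [0.45, 1.38, 0.00, 0.15, 1.53, 0.00, 1.45, 0.00],
    "C3": [1.45, 1.53, 0.00, 0.15, 1.68, 0.00, 1.60, 0.00],
    "C4": [0.00, 0.00, 1.30, 0.38, 0.00, 1.38, 0.00, 1.38],
    "C5": [0.00, 0.08, 0.38, 1.30, 0.08, 1.38, 0.00, 1.38],
    "C6": [0.00, 0.08, 1.37, 1.37, 0.08, 1.53, 0.00, 1.53],
}


def mean_pattern(countries):
    """Element-wise mean of the patterns of the given instances."""
    rows = [PATTERNS[c] for c in countries]
    return [sum(column) / len(column) for column in zip(*rows)]


def top_nodes(pattern, count=3):
    """Node names sorted by descending activation."""
    order = sorted(range(len(NODES)), key=lambda i: pattern[i], reverse=True)
    return [NODES[i] for i in order[:count]]


group_a = mean_pattern(["C1", "C2", "C3"])
group_b = mean_pattern(["C4", "C5", "C6"])
print([round(v, 2) for v in group_a], top_nodes(group_a))
print([round(v, 2) for v in group_b], top_nodes(group_b))
```

Sorting the mean pattern by activation gives a quick description of a group: for C1 to C3 the strongest nodes are 20% and 5, followed by the two exports they share.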
Country Coffee Cacao Machinery Chemicals Unemployment rate Fertility rate
C1 1 0 0 0 20% 5
C2 0 1 0 0 20% 5
C3 1 1 0 0 20% 5
C4 0 0 1 0 5% 2
C5 0 0 0 1 5% 2
C6 0 0 1 1 5% 2
C7 0 1 0 1 20% missing data
C8 missing data missing data missing data missing data 20% 5
C9 1 1 0 0 missing data missing data
Same model: an Android application, a country or a document... the activation values always have the same meaning
Semantic Patterns
Value-centric feature vectors
Evaluation
26 data sets from the UCI machine learning repository
Supervised: SVM
Unsupervised: EM and k-Means
Application to raw data and to Semantic Patterns
SP-Parameters: D=0.5, Comb=E, Norm=L, MDL=1.5, σ = 0.2
Data set       Label  Inst  DF  SF  Classes  SVM (N)  SVM (NN)  SVM (P)  KM (N)  KM (NN)  KM (P)  EM (NN)  EM (P)
Categorical
Breast Cancer  BC     286   -   9   2        0.03     0.04      0.04     0.01    0.01     0.06    0.00     0.08
Dermatology    DE     366   1   33  6        0.93     0.92      0.95     0.58    0.09     0.86    0.87     0.87
KR vs. KP      KR     3196  -   36  2        0.75     0.75      0.72     0.00    0.01     0.00    0.04     0.00
Lymph          LY     148   -   18  4        0.53     0.51      0.48     0.13    0.18     0.25    0.26     0.27
Mushroom       MU     8124  -   22  2        1.00     1.00      1.00     0.48    0.47     0.45    0.61     0.59
Soybean        SO     683   -   35  19       0.92     0.92      0.93     0.59    0.62     0.73    0.79     0.79
Splice         SP     3190  -   60  3        0.71     0.72      0.80     0.03    0.03     0.44    0.41     0.31
Vote           VO     435   -   16  2        0.76     0.74      0.67     0.47    0.48     0.47    0.49     0.45
Zoo            ZO     101   -   17  7        0.94     0.94      0.97     0.78    0.78     0.82    0.82     0.85
Total                                        0.73     0.73      0.73     0.34    0.30     0.45    0.48     0.47
Mixed
Anneal         AN     898   6   32  6        0.86     0.86      0.92     0.23    0.03     0.30    0.31     0.32
Colic          CO     368   7   15  2        0.31     0.32      0.31     0.13    0.03     0.05    0.10     0.12
Credit-A       CA     689   6   9   2        0.41     0.41      0.39     0.16    0.02     0.25    0.17     0.21
Credit-G       CG     1000  7   13  2        0.11     0.10      0.12     0.01    0.01     0.00    0.01     0.02
Heart-C        HC     303   6   7   5        0.36     0.36      0.29     0.24    0.01     0.36    0.31     0.28
Heart-H        HH     294   6   7   5        0.32     0.31      0.33     0.27    0.01     0.32    0.28     0.25
Hepatitis      HE     155   5   14  2        0.25     0.28      0.21     0.13    0.00     0.21    0.22     0.24
Total                                        0.37     0.38      0.37     0.17    0.02     0.21    0.20     0.20
Numerical
Breast-w       BW     699   9   -   2        0.78     0.78      0.77     0.73    0.74     0.82    0.72     0.58
Diabetes       DI     768   8   -   2        0.18     0.18      0.15     0.05    0.03     0.10    0.10     0.08
Glass          GL     214   9   -   7        0.30     0.30      0.50     0.34    0.39     0.33    0.37     0.36
Heart-Statlog  HS     270   13  -   2        0.36     0.36      0.37     0.25    0.02     0.39    0.29     0.27
Ionosphere     IO     351   34  -   2        0.48     0.48      0.50     0.12    0.12     0.16    0.25     0.25
Iris           IR     150   4   -   3        0.87     0.87      0.87     0.71    0.71     0.75    0.81     0.78
Segment        SE     2310  19  -   7        0.88     0.88      0.90     0.61    0.53     0.59    0.62     0.60
Sonar          SO     208   60  -   2        0.23     0.23      0.23     0.01    0.01     0.02    0.01     0.01
Vehicle        VE     846   18  -   4        0.51     0.51      0.48     0.11    0.19     0.19    0.10     0.19
Vowel          VO     990   10  3   11       0.63     0.63      0.76     0.06    0.34     0.23    0.19     0.25
Total                                        0.52     0.52      0.55     0.30    0.31     0.36    0.35     0.34
DOES IT WORK?
• Applications described in several publications, which analyze
  • e-Participation (Egyptian revolution, Fukushima, Mitmachen): text documents
  • Intrusion detection: event correlation
  • RDF data analysis (semantic web)
  • WiFi privacy (analyzing captured emails)
  • Android Market application analysis
Current Project
Android application security
Container applications for BYOD (require encryption, secure communication, key derivation functions, root checks etc.)
Manual analysis is cumbersome
Semantic Patterns
Extract Dalvik VM code, features (opcodes, methods, local variables etc.)
Apply Semantic Patterns technique
Clustering, supervised learning, anomaly detection etc.
Current Project
Also works directly on the phone...
Detecting SMS catchers/sniffers
More fine-grained detection
asymmetric cryptography
symmetric cryptography
Outlook
Publish the Java API...
basically a converter from arbitrary feature vectors to Semantic Patterns (e.g. in/out in ARFF format)
Deep learning...
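To illustrate the output side of such a converter, the sketch below writes Semantic Patterns as an ARFF file with one numeric attribute per network node. It is not the announced Java API; the function name `to_arff` and the relation name are hypothetical, and attribute names are quoted because ARFF treats % as a comment character.

```python
def to_arff(relation, nodes, patterns):
    """Serialize patterns (lists of activation values) as ARFF text."""
    lines = [f"@relation {relation}", ""]
    for node in nodes:
        # Quote names so that characters such as '%' are read as part of the name.
        lines.append(f"@attribute '{node}' numeric")
    lines += ["", "@data"]
    for values in patterns:
        lines.append(",".join(f"{value:.2f}" for value in values))
    return "\n".join(lines) + "\n"


NODES = ["coffee", "cacao", "machinery", "chemicals", "20%", "5%", "5", "2"]
C9 = [1.15, 1.15, 0.00, 0.08, 0.38, 0.00, 0.30, 0.00]  # pattern from the slides

arff_text = to_arff("semantic_patterns", NODES, [C9])
print(arff_text)
```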
Thx!
K-Means and EM results per parameter setting (Par)
Columns in each half: Total BC DE KR LY MU SO SP VO ZO
Par    K-Means                                                     | EM
Raw Data
N      0.341 0.012 0.584 0.004 0.131 0.475 0.587 0.031 0.467 0.782 | Not available
NN     0.296 0.007 0.094 0.010 0.176 0.472 0.616 0.030 0.476 0.783 | 0.477 0.002 0.871 0.036 0.258 0.610 0.789 0.410 0.494 0.822
Semantic Patterns
D 0.0  0.443 0.025 0.849 0.003 0.199 0.413 0.728 0.465 0.493 0.814 | 0.449 0.004 0.767 0.001 0.222 0.590 0.740 0.423 0.489 0.801
Comb=E Norm=L
D 0.1  0.442 0.029 0.811 0.004 0.245 0.545 0.726 0.387 0.476 0.759 | 0.441 0.074 0.885 0.000 0.271 0.615 0.786 0.004 0.505 0.826
D 0.3  0.447 0.068 0.846 0.004 0.241 0.482 0.724 0.424 0.476 0.758 | 0.460 0.079 0.875 0.001 0.258 0.592 0.788 0.250 0.449 0.846
D 0.5  0.452 0.061 0.856 0.000 0.245 0.448 0.733 0.437 0.467 0.820 | 0.468 0.079 0.874 0.001 0.265 0.592 0.789 0.306 0.452 0.850
D 0.7  0.422 0.069 0.826 0.000 0.209 0.275 0.728 0.419 0.463 0.804 | 0.465 0.079 0.874 0.001 0.252 0.579 0.799 0.312 0.445 0.847
Comb=S Norm=L
D 0.1  0.441 0.056 0.853 0.000 0.244 0.453 0.733 0.399 0.476 0.759 | 0.433 0.079 0.872 0.001 0.270 0.572 0.794 0.001 0.476 0.829
D 0.3  0.434 0.075 0.820 0.000 0.228 0.411 0.718 0.431 0.472 0.750 | 0.466 0.079 0.881 0.001 0.280 0.592 0.802 0.298 0.437 0.828
D 0.5  0.439 0.060 0.792 0.000 0.235 0.416 0.741 0.405 0.463 0.836 | 0.466 0.079 0.871 0.001 0.251 0.581 0.805 0.310 0.445 0.848
D 0.7  0.422 0.067 0.798 0.000 0.224 0.364 0.726 0.376 0.462 0.782 | 0.462 0.087 0.875 0.001 0.254 0.580 0.776 0.292 0.445 0.845
Comb=E Norm=S
D 0.1  0.418 0.029 0.790 0.006 0.236 0.311 0.705 0.449 0.496 0.742 | 0.472 0.002 0.893 0.000 0.263 0.571 0.767 0.432 0.495 0.820
D 0.3  0.452 0.030 0.860 0.001 0.231 0.470 0.715 0.475 0.491 0.799 | 0.476 0.002 0.914 0.000 0.261 0.586 0.775 0.427 0.495 0.823
D 0.5  0.448 0.048 0.799 0.009 0.215 0.539 0.725 0.450 0.493 0.758 | 0.472 0.002 0.897 0.000 0.267 0.584 0.758 0.427 0.484 0.829
D 0.7  0.448 0.033 0.850 0.000 0.230 0.495 0.712 0.435 0.493 0.787 | 0.473 0.002 0.903 0.000 0.250 0.586 0.773 0.427 0.484 0.829
Comb=S Norm=S
D 0.1  0.439 0.029 0.806 0.009 0.250 0.435 0.727 0.439 0.494 0.760 | 0.475 0.002 0.903 0.000 0.254 0.576 0.764 0.429 0.495 0.852
D 0.3  0.420 0.015 0.775 0.004 0.210 0.436 0.717 0.409 0.443 0.774 | 0.474 0.002 0.901 0.000 0.271 0.584 0.763 0.427 0.484 0.837
D 0.5  0.429 0.030 0.789 0.009 0.226 0.410 0.716 0.448 0.485 0.749 | 0.476 0.002 0.904 0.000 0.255 0.586 0.767 0.427 0.484 0.854
D 0.7  0.438 0.040 0.839 0.006 0.246 0.418 0.726 0.409 0.480 0.775 | 0.480 0.002 0.910 0.000 0.269 0.615 0.771 0.431 0.494 0.825
K-Means and EM results per parameter setting (Par)
Columns in each half: Total AN CO CA CG HC HH HE
Par    K-Means                                         | EM
Raw Data
N      0.165 0.226 0.129 0.155 0.009 0.237 0.269 0.131 | Not available
NN     0.017 0.028 0.030 0.016 0.012 0.014 0.012 0.004 | 0.201 0.312 0.103 0.171 0.013 0.309 0.278 0.223
Semantic Patterns
K-Means: D=0.0 MDL=2.0 | EM: D=0.0 MDL=1.0
σ 0.0  0.193 0.253 0.135 0.113 0.007 0.356 0.293 0.195 | 0.190 0.291 0.098 0.227 0.003 0.228 0.258 0.227
σ 0.2  0.198 0.271 0.147 0.116 0.007 0.356 0.301 0.189 | 0.182 0.280 0.098 0.162 0.003 0.244 0.258 0.231
σ 0.4  0.204 0.240 0.157 0.145 0.009 0.356 0.327 0.194 | 0.184 0.226 0.099 0.229 0.004 0.245 0.258 0.227
σ 0.6  0.194 0.221 0.154 0.145 0.008 0.359 0.275 0.196 | 0.194 0.291 0.097 0.240 0.003 0.217 0.281 0.229
σ 0.8  0.200 0.258 0.152 0.098 0.007 0.358 0.327 0.197 | 0.192 0.293 0.097 0.232 0.004 0.228 0.258 0.230
K-Means: D=0.5 MDL=1.0 | EM: D=0.7 MDL=1.0
σ 0.0  0.211 0.320 0.042 0.262 0.001 0.325 0.311 0.215 | 0.210 0.327 0.127 0.218 0.021 0.237 0.311 0.229
σ 0.2  0.201 0.257 0.032 0.262 0.001 0.323 0.311 0.222 | 0.210 0.322 0.126 0.218 0.021 0.237 0.320 0.229
σ 0.4  0.208 0.299 0.035 0.261 0.001 0.326 0.311 0.220 | 0.211 0.322 0.127 0.218 0.021 0.237 0.320 0.229
σ 0.6  0.204 0.281 0.029 0.262 0.001 0.325 0.311 0.220 | 0.211 0.321 0.128 0.218 0.021 0.237 0.320 0.229
σ 0.8  0.207 0.292 0.041 0.263 0.001 0.326 0.311 0.216 | 0.209 0.310 0.127 0.218 0.021 0.237 0.320 0.229
K-Means: D=0.5 MDL=1.5 | EM: D=0.7 MDL=1.5
σ 0.0  0.216 0.317 0.065 0.249 0.001 0.357 0.320 0.203 | 0.204 0.322 0.123 0.212 0.016 0.275 0.247 0.233
σ 0.2  0.211 0.295 0.052 0.247 0.000 0.355 0.320 0.209 | 0.204 0.322 0.123 0.212 0.016 0.275 0.247 0.236
σ 0.4  0.216 0.314 0.074 0.248 0.001 0.357 0.320 0.198 | 0.205 0.323 0.123 0.206 0.016 0.275 0.252 0.237
σ 0.6  0.212 0.308 0.046 0.249 0.001 0.356 0.320 0.209 | 0.204 0.320 0.125 0.208 0.016 0.275 0.246 0.236
σ 0.8  0.211 0.293 0.063 0.248 0.000 0.354 0.320 0.201 | 0.204 0.323 0.125 0.208 0.016 0.275 0.249 0.232
K-Means: D=0.5 MDL=2.0 | EM: D=0.7 MDL=2.0
σ 0.0  0.217 0.304 0.048 0.244 0.000 0.390 0.311 0.219 | 0.206 0.319 0.117 0.229 0.010 0.255 0.277 0.233
σ 0.2  0.218 0.313 0.062 0.244 0.000 0.388 0.311 0.208 | 0.207 0.317 0.126 0.239 0.010 0.255 0.268 0.233
σ 0.4  0.221 0.309 0.084 0.243 0.000 0.389 0.311 0.209 | 0.205 0.319 0.127 0.224 0.010 0.255 0.268 0.233
σ 0.6  0.213 0.285 0.057 0.243 0.000 0.387 0.311 0.210 | 0.206 0.307 0.127 0.240 0.010 0.255 0.268 0.233
σ 0.8  0.211 0.295 0.036 0.244 0.000 0.387 0.311 0.205 | 0.204 0.305 0.127 0.240 0.010 0.255 0.259 0.233
K-Means: D=0.5 MDL=3.0 | EM: D=0.7 MDL=3.0
σ 0.0  0.203 0.294 0.030 0.248 0.000 0.335 0.315 0.196 | 0.192 0.323 0.108 0.248 0.009 0.201 0.250 0.205
σ 0.2  0.208 0.306 0.059 0.248 0.000 0.334 0.315 0.193 | 0.190 0.321 0.107 0.237 0.009 0.201 0.251 0.205
σ 0.4  0.205 0.310 0.050 0.248 0.000 0.334 0.315 0.178 | 0.193 0.322 0.122 0.243 0.009 0.201 0.249 0.205
σ 0.6  0.207 0.300 0.063 0.248 0.001 0.333 0.313 0.192 | 0.192 0.321 0.122 0.243 0.010 0.201 0.245 0.205
σ 0.8  0.210 0.330 0.050 0.246 0.001 0.336 0.315 0.191 | 0.192 0.323 0.122 0.243 0.009 0.201 0.240 0.205
K-Means and EM results per parameter setting (Par)
Columns in each half: Total BW DI GL HS IO IR SE SO VE VO
Par    K-Means                                                           | EM
Raw Data
N      0.299 0.734 0.052 0.335 0.254 0.121 0.708 0.608 0.006 0.113 0.057 | Not available
NN     0.307 0.735 0.030 0.388 0.019 0.123 0.705 0.529 0.008 0.188 0.342 | 0.346 0.718 0.103 0.370 0.289 0.254 0.806 0.621 0.005 0.103 0.194
Semantic Patterns
K-Means: D=0.0 MDL=1.5 | EM: D=0.0 MDL=1.0
σ 0.0  0.315 0.724 0.039 0.329 0.309 0.045 0.717 0.582 0.026 0.198 0.183 | 0.317 0.777 0.006 0.312 0.239 0.218 0.651 0.592 0.016 0.174 0.186
σ 0.2  0.323 0.724 0.025 0.334 0.344 0.071 0.730 0.590 0.012 0.198 0.196 | 0.327 0.752 0.001 0.318 0.240 0.218 0.766 0.598 0.016 0.167 0.197
σ 0.4  0.318 0.719 0.026 0.285 0.316 0.051 0.769 0.600 0.008 0.199 0.203 | 0.323 0.727 0.011 0.287 0.229 0.217 0.749 0.600 0.018 0.176 0.218
σ 0.6  0.317 0.722 0.025 0.298 0.357 0.040 0.712 0.602 0.013 0.199 0.201 | 0.317 0.732 0.009 0.316 0.232 0.221 0.637 0.606 0.025 0.175 0.214
σ 0.8  0.299 0.646 0.015 0.294 0.328 0.026 0.686 0.581 0.014 0.198 0.200 | 0.325 0.703 0.006 0.305 0.233 0.216 0.796 0.594 0.019 0.181 0.195
K-Means: D=0.5 MDL=1.0 | EM: D=0.7 MDL=1.0
σ 0.0  0.333 0.817 0.072 0.293 0.338 0.181 0.611 0.614 0.009 0.164 0.234 | 0.302 0.579 0.082 0.332 0.285 0.184 0.633 0.634 0.006 0.099 0.183
σ 0.2  0.333 0.817 0.076 0.278 0.340 0.181 0.621 0.621 0.009 0.151 0.237 | 0.300 0.579 0.082 0.307 0.285 0.184 0.636 0.632 0.006 0.117 0.176
σ 0.4  0.326 0.817 0.068 0.286 0.335 0.181 0.587 0.604 0.009 0.149 0.228 | 0.301 0.579 0.086 0.310 0.285 0.184 0.639 0.643 0.006 0.095 0.183
σ 0.6  0.327 0.817 0.072 0.269 0.337 0.181 0.604 0.580 0.009 0.166 0.232 | 0.301 0.579 0.076 0.319 0.285 0.184 0.639 0.632 0.006 0.109 0.185
σ 0.8  0.334 0.817 0.071 0.303 0.336 0.181 0.610 0.605 0.011 0.163 0.244 | 0.300 0.579 0.079 0.311 0.285 0.184 0.633 0.633 0.006 0.109 0.183
K-Means: D=0.5 MDL=1.5 | EM: D=0.7 MDL=1.5
σ 0.0  0.352 0.817 0.099 0.298 0.382 0.143 0.751 0.601 0.018 0.193 0.218 | 0.339 0.579 0.086 0.348 0.324 0.242 0.761 0.596 0.013 0.187 0.252
σ 0.2  0.358 0.817 0.100 0.330 0.385 0.163 0.751 0.588 0.015 0.194 0.232 | 0.339 0.579 0.086 0.356 0.324 0.242 0.761 0.595 0.012 0.192 0.239
σ 0.4  0.352 0.817 0.096 0.315 0.387 0.143 0.738 0.576 0.019 0.193 0.231 | 0.340 0.579 0.092 0.348 0.324 0.242 0.761 0.603 0.012 0.194 0.241
σ 0.6  0.348 0.817 0.103 0.288 0.383 0.158 0.716 0.579 0.015 0.194 0.226 | 0.339 0.579 0.094 0.355 0.324 0.242 0.761 0.602 0.012 0.181 0.240
σ 0.8  0.356 0.817 0.098 0.296 0.378 0.166 0.776 0.604 0.012 0.190 0.225 | 0.338 0.579 0.107 0.355 0.324 0.242 0.752 0.597 0.012 0.177 0.236
K-Means: D=0.5 MDL=2.0 | EM: D=0.7 MDL=2.0
σ 0.0  0.329 0.817 0.054 0.339 0.330 0.064 0.752 0.563 0.017 0.151 0.199 | 0.323 0.579 0.105 0.347 0.266 0.228 0.784 0.585 0.015 0.092 0.227
σ 0.2  0.328 0.817 0.052 0.320 0.330 0.064 0.753 0.585 0.017 0.144 0.196 | 0.325 0.579 0.098 0.359 0.266 0.228 0.784 0.584 0.015 0.098 0.238
σ 0.4  0.331 0.817 0.055 0.313 0.330 0.109 0.767 0.562 0.012 0.149 0.194 | 0.323 0.579 0.105 0.358 0.266 0.228 0.784 0.576 0.015 0.090 0.230
σ 0.6  0.330 0.817 0.059 0.335 0.328 0.073 0.765 0.560 0.019 0.148 0.199 | 0.326 0.579 0.099 0.351 0.266 0.228 0.798 0.595 0.015 0.091 0.235
σ 0.8  0.333 0.817 0.064 0.321 0.330 0.068 0.764 0.593 0.013 0.158 0.200 | 0.326 0.579 0.104 0.361 0.266 0.228 0.798 0.585 0.015 0.090 0.237
K-Means: D=0.5 MDL=3.0 | EM: D=0.7 MDL=3.0
σ 0.0  0.322 0.817 0.026 0.326 0.333 0.099 0.739 0.567 0.022 0.136 0.153 | 0.304 0.579 0.001 0.362 0.200 0.228 0.728 0.574 0.032 0.114 0.224
σ 0.2  0.322 0.817 0.029 0.326 0.320 0.127 0.702 0.583 0.017 0.150 0.150 | 0.307 0.579 0.000 0.364 0.208 0.228 0.735 0.573 0.029 0.113 0.236
σ 0.4  0.317 0.817 0.035 0.318 0.320 0.099 0.705 0.556 0.024 0.140 0.154 | 0.306 0.579 0.001 0.355 0.211 0.228 0.726 0.572 0.035 0.113 0.237
σ 0.6  0.328 0.817 0.026 0.342 0.328 0.118 0.759 0.563 0.020 0.150 0.153 | 0.307 0.579 0.001 0.363 0.219 0.228 0.729 0.575 0.029 0.113 0.233
σ 0.8  0.323 0.817 0.029 0.330 0.322 0.099 0.731 0.563 0.023 0.151 0.161 | 0.304 0.579 0.001 0.356 0.204 0.224 0.713 0.589 0.030 0.119 0.226
Results by distance (Euc, Cos), data (Raw, Semantic Patterns) and share of missing values (0%, 10%, 50%, 90%)
Column groups, left to right: Euc Raw | Euc Semantic Patterns | Cos Raw | Cos Semantic Patterns
Data   0%   10%  50%  90%  0%   10%  50%  90%  0%   10%  50%  90%  0%   10%  50%  90%
Categorical
BC     0.52 0.52 0.52 0.52 0.54 0.54 0.53 0.50 0.53 0.53 0.53 0.51 0.54 0.54 0.53 0.51
DE     0.68 0.66 0.55 0.32 0.81 0.80 0.38 0.22 0.66 0.66 0.67 0.36 0.81 0.80 0.74 0.46
KR     0.54 0.54 0.53 0.52 0.52 0.52 0.51 0.50 0.54 0.54 0.53 0.51 0.52 0.52 0.52 0.51
LY     0.63 0.68 0.63 0.30 0.63 0.59 0.64 0.48 0.59 0.53 0.51 0.32 0.61 0.58 0.56 0.35
MU     0.64 0.64 0.62 0.57 0.68 0.67 0.62 0.53 0.57 0.57 0.56 0.54 0.67 0.67 0.67 0.62
SO     0.65 0.63 0.53 0.22 0.75 0.70 0.09 0.08 0.58 0.56 0.50 0.18 0.73 0.72 0.63 0.28
SP     0.48 0.47 0.44 0.38 0.62 0.46 0.39 0.39 0.44 0.44 0.41 0.37 0.57 0.57 0.54 0.45
VO     0.80 0.79 0.76 0.67 0.78 0.78 0.68 0.51 0.62 0.63 0.67 0.62 0.79 0.79 0.78 0.72
ZO     0.83 0.81 0.72 0.31 0.86 0.85 0.64 0.24 0.80 0.79 0.71 0.31 0.86 0.84 0.76 0.41
Total  0.64 0.64 0.59 0.42 0.69 0.66 0.50 0.38 0.59 0.58 0.57 0.41 0.68 0.67 0.64 0.48
Mixed
AN     0.64 0.63 0.55 0.38 0.66 0.67 0.51 0.38 0.44 0.46 0.50 0.38 0.66 0.66 0.61 0.42
CO     0.59 0.59 0.56 0.51 0.59 0.58 0.52 0.50 0.50 0.50 0.51 0.51 0.62 0.62 0.60 0.57
CA     0.62 0.61 0.59 0.54 0.65 0.65 0.60 0.52 0.55 0.55 0.54 0.51 0.65 0.64 0.63 0.57
CG     0.52 0.52 0.52 0.50 0.52 0.53 0.54 0.53 0.51 0.51 0.52 0.51 0.52 0.52 0.52 0.52
HC     0.86 0.86 0.85 0.81 0.87 0.87 0.85 0.81 0.81 0.81 0.82 0.81 0.87 0.87 0.86 0.84
HH     0.87 0.86 0.85 0.82 0.87 0.87 0.83 0.80 0.84 0.84 0.83 0.81 0.88 0.88 0.87 0.83
HE     0.59 0.58 0.56 0.50 0.64 0.64 0.58 0.55 0.52 0.51 0.55 0.52 0.65 0.65 0.64 0.57
Total  0.67 0.67 0.64 0.58 0.69 0.69 0.63 0.58 0.60 0.60 0.61 0.58 0.69 0.69 0.68 0.62
Numerical
BW     0.86 0.86 0.76 0.68 0.91 0.91 0.84 0.69 0.62 0.61 0.59 0.50 0.90 0.89 0.88 0.84
DI     0.55 0.54 0.53 0.53 0.56 0.55 0.54 0.50 0.53 0.53 0.52 0.50 0.56 0.55 0.55 0.53
GL     0.49 0.45 0.31 0.30 0.53 0.52 0.42 0.31 0.51 0.51 0.48 0.29 0.53 0.52 0.48 0.34
HS     0.64 0.63 0.59 0.52 0.69 0.69 0.61 0.53 0.54 0.54 0.55 0.51 0.69 0.69 0.65 0.60
IO     0.51 0.52 0.55 0.54 0.61 0.61 0.56 0.46 0.46 0.46 0.47 0.51 0.61 0.61 0.60 0.57
IR     0.81 0.60 0.47 0.33 0.83 0.81 0.75 0.67 0.87 0.84 0.77 0.34 0.84 0.81 0.76 0.75
SE     0.61 0.53 0.21 0.15 0.57 0.57 0.43 0.17 0.39 0.40 0.44 0.27 0.57 0.57 0.55 0.41
SO     0.54 0.53 0.51 0.50 0.54 0.54 0.51 0.50 0.52 0.52 0.52 0.52 0.54 0.54 0.54 0.53
VE     0.35 0.33 0.29 0.26 0.37 0.37 0.35 0.28 0.36 0.36 0.36 0.31 0.37 0.37 0.36 0.33
VO     0.15 0.15 0.12 0.09 0.22 0.21 0.16 0.10 0.20 0.20 0.17 0.10 0.21 0.21 0.20 0.13
Total  0.55 0.51 0.43 0.39 0.58 0.58 0.52 0.42 0.50 0.50 0.49 0.38 0.58 0.58 0.56 0.50
Column groups, left to right: RAW (first three columns) | Baseline (next two) | Semantic Patterns (last two)
Data set  EUC (N)  EUC (NN)  COS (NN)  EUC (NN)  COS (NN)  EUC (NN)  COS (NN)
Categorical
BC        0.52     0.53      0.53      0.52      0.53      0.54      0.54
DE        0.68     0.68      0.66      0.67      0.67      0.81      0.81
KR        0.54     0.54      0.54      0.54      0.54      0.52      0.52
LY        0.63     0.63      0.59      0.60      0.57      0.63      0.61
MU        0.64     0.64      0.57      0.64      0.64      0.68      0.67
SO        0.65     0.65      0.58      0.69      0.70      0.75      0.73
SP        0.48     0.48      0.44      0.48      0.48      0.62      0.57
VO        0.80     0.80      0.62      0.80      0.80      0.78      0.79
ZO        0.84     0.83      0.80      0.85      0.84      0.86      0.86
Total     0.64     0.64      0.59      0.64      0.64      0.69      0.68
Mixed
AN        0.64     0.64      0.44      0.64      0.65      0.65      0.66
CO        0.59     0.59      0.50      0.59      0.60      0.58      0.62
CA        0.62     0.62      0.55      0.61      0.61      0.61      0.65
CG        0.52     0.52      0.51      0.52      0.52      0.52      0.52
HC        0.86     0.86      0.81      0.85      0.85      0.86      0.87
HH        0.87     0.87      0.84      0.86      0.86      0.86      0.88
HE        0.59     0.59      0.52      0.61      0.60      0.63      0.65
Total     0.67     0.67      0.60      0.67      0.67      0.67      0.69
Numerical
BW        0.86     0.86      0.62      0.74      0.74      0.89      0.90
DI        0.55     0.55      0.53      0.54      0.54      0.55      0.56
GL        0.49     0.49      0.51      0.51      0.51      0.53      0.53
HS        0.64     0.64      0.54      0.63      0.63      0.66      0.69
IO        0.51     0.51      0.46      0.55      0.55      0.63      0.61
IR        0.81     0.81      0.87      0.73      0.73      0.81      0.83
SE        0.61     0.61      0.39      0.54      0.54      0.57      0.57
SO        0.54     0.54      0.52      0.54      0.54      0.54      0.54
VE        0.35     0.35      0.36      0.37      0.37      0.36      0.37
VO        0.15     0.15      0.20      0.21      0.21      0.22      0.21
Total     0.55     0.55      0.50      0.54      0.54      0.58      0.58

Semantic Pattern Transformation

  • 1. IAIK Semantic Pattern Transformation IKNOW 2013 Peter Teufl, Herbert Leitold, Reinhard Posch peter.teufl@iaik.tugraz.at
  • 2. IAIK Our Background. Topics: mobile device security; cloud security; security consulting for public institutions (Austria); IT security research; IT security lectures; e-Government; A-SIT.
  • 3. IAIK Why does he talk about Knowledge Discovery? How does IT security relate to knowledge discovery? eGov / eParticipation: document analysis, Twitter etc.; intrusion detection systems (network traffic analysis); malware detection (network traffic, mobile phones); mobile application analysis (metadata, market descriptions); mobile application security (hot topic, BYOD, etc.).
  • 4. IAIK What to expect? Motivation for the Semantic Pattern Transformation Basic concepts, techniques How does it work? Evaluation? Applications, results, current topics!
  • 5. IAIK Environment. Arbitrary features; no a priori knowledge; heterogeneous domains; clustering; supervised learning; anomaly detection; semantic search; visualization; extracting knowledge; text analysis; Android market descriptions; histograms; flexible deployment; new domains; terms; numbers.
  • 6. IAIK Process... •Different processing steps •From defining the goals •To extracting the desired knowledge •Machine learning algorithms are often used within KDD •However, the complete machine learning process is quite similar to KDD. [Figure, two pipelines. KDT (Fayyad et al.): Knowledge discovery goals, Target data set, Preprocessing, Data extraction, Data mining method, Data mining algorithm, Knowledge extraction, Data mining, Knowledge processing. ML-KDT (Machine learning): Domain-specific data set, Machine learning goals, Instance extraction, Feature selection/construction, Instance selection, Machine learning algorithm, Preprocessing, Algorithm application, Interpretation.]
  • 7. IAIK ADAPTATION COMPLEXITY? •Assuming an arbitrary data set (e-Participation, Android Market applications) •Further assuming a knowledge discovery goal, e.g., unsupervised clustering •Then: we need to adapt the steps on the left •And: we need to adapt this setup when the data changes, even when the knowledge discovery goals remain the same! •Android Market applications vs. text documents vs. network traffic vs. malware detection? [Figure: machine learning pipeline (Domain-specific data set, Machine learning goals, Instance extraction, Feature selection/construction, Instance selection, Algorithm selection, Preprocessing, Algorithm application, Interpretation), with each step rated High, Medium or Low for its dependence on domain data and goals.]
  • 8. IAIK TOWARDS A SEMANTIC REPRESENTATION •Finding a new representation... •The new representation is called Semantic Patterns •Key properties: •Still a vector representation (compatible with the old representation) •Not the feature values themselves, but their semantic relations are represented •All values have the same meaning and feature type (activation) •The transformation from raw data into Semantic Patterns is the Semantic Pattern Transformation
  • 9. IAIK SEMANTIC PATTERN TRANSFORMATION •The Semantic Pattern Transformation is arranged in five layers •Layer 1 - Feature extraction •Layer 2 - Associative network - Node generation •Layer 3 - Associative network - Link generation •Layer 4 - Spreading activation (SA) •Layer 5 - Analysis (machine learning, semantic search etc.) [Figure: data set and instances mapped through Layer 1 (Feature Extraction), Layers 2-3 (Associative Network Generation) and Layer 4 (Spreading Activation) into patterns P 1 to P 4, which feed Layer 5 (Analysis): supervised learning, unsupervised clustering, semantic relations, feature value relevance, anomaly detection, semantic development over time, pattern similarity.]
  • 10. IAIK SPT: Layer 1 - Feature extraction. Extract the features and their values, and determine each feature's type (categorical or distance-based). Categorical: Exports. Distance-based: Unemployment rate, Fertility rate.

      Country  Exports               Unemployment rate  Fertility rate
      C1       coffee                20%                5
      C2       cacao                 20%                5
      C3       coffee, cacao         20%                5
      C4       machinery             5%                 2
      C5       chemicals             5%                 2
      C6       chemicals, machinery  5%                 2
      C7       chemicals, cacao      20%                missing data
      C8       missing data          20%                5
      C9       coffee, cacao         missing data       missing data
  • 11. IAIK SPT: Layer 2 - Node generation. Categorical feature values: one node for each value. Distance-based feature values: map value ranges to single nodes. [Figure: associative network with the nodes coffee, cacao, machinery, chemicals, 20%, 5%, 5, 2; example table as on slide 10.]
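The node-generation rule on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the bin edges, the node label format and the helper names `range_node` and `instance_nodes` are all hypothetical. The slides only say that value ranges are mapped to single nodes.

```python
import bisect
from typing import Dict, List, Optional, Sequence

# Hypothetical bin edges; the slides do not say how ranges are chosen.
EDGES: Dict[str, Sequence[int]] = {
    "unemployment": [0, 10, 100],
    "fertility": [0, 3, 10],
}

# Example table from the slides; None marks missing data.
ROWS: Dict[str, dict] = {
    "C1": {"exports": ["coffee"], "unemployment": 20, "fertility": 5},
    "C2": {"exports": ["cacao"], "unemployment": 20, "fertility": 5},
    "C3": {"exports": ["coffee", "cacao"], "unemployment": 20, "fertility": 5},
    "C4": {"exports": ["machinery"], "unemployment": 5, "fertility": 2},
    "C5": {"exports": ["chemicals"], "unemployment": 5, "fertility": 2},
    "C6": {"exports": ["chemicals", "machinery"], "unemployment": 5, "fertility": 2},
    "C7": {"exports": ["chemicals", "cacao"], "unemployment": 20, "fertility": None},
    "C8": {"exports": None, "unemployment": 20, "fertility": 5},
    "C9": {"exports": ["coffee", "cacao"], "unemployment": None, "fertility": None},
}


def range_node(feature: str, value: float, edges: Sequence[int]) -> str:
    """Map a distance-based value to the node of the range that contains it."""
    i = bisect.bisect_right(edges, value) - 1
    i = max(0, min(i, len(edges) - 2))  # clamp values outside the edges
    return f"{feature}:[{edges[i]},{edges[i + 1]})"


def instance_nodes(row: dict) -> List[str]:
    """Return the node labels one instance touches; missing values add none."""
    nodes: List[str] = []
    exports: Optional[List[str]] = row["exports"]
    if exports is not None:
        nodes += [f"exports:{v}" for v in exports]  # one node per categorical value
    for feature in ("unemployment", "fertility"):
        if row[feature] is not None:
            nodes.append(range_node(feature, row[feature], EDGES[feature]))
    return nodes


ALL_NODES = sorted({n for row in ROWS.values() for n in instance_nodes(row)})
```

With these assumed edges the nine countries produce eight nodes, matching the eight nodes drawn in the slide's network.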
  • 12. IAIK SPT: Layer 3 - Link generation. [Figure: associative network over the nodes coffee, cacao, machinery, chemicals, 20%, 5%, 5, 2, with links drawn by weight (legend: 0.25, 0.5, 0.75, 1.00). Annotated examples: "coffee, 20%, 5" and "chemicals, cacao, 20%". Example table as on slide 10.]
  • 13. IAIK SPT: Layer 4 - Spreading activation. Creating a Semantic Pattern, in this case for “coffee” and “cacao”: set the activation value of the two nodes to 1.0, then spread this activation value to neighboring nodes via the weighted links. [Figure: associative network with coffee and cacao each activated at 1.0.]
  • 14. IAIK SPT: Layer 4 - Spreading activation. Typically, one would create Semantic Patterns for all instances within the data set, e.g. a pattern for C1 by activating coffee, 20% and 5. However, we can also create patterns for feature values, e.g. “coffee”. [Example table as on slide 10.]
  • 15. IAIK SPT: Layer 4 - Spreading activation. After SA, each node in the network has an activation value. By representing the nodes and their activation values as a vector, we gain a Semantic Pattern.

      Node        Activation
      cacao       1.15
      coffee      1.15
      20%         0.38
      5           0.30
      chemicals   0.08
      2           0.00
      5%          0.00
      machinery   0.00
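Layers 3 and 4 can be sketched together as follows. The slides do not state how link weights are derived or how activation decays, so this sketch makes its own assumptions: link weight is the co-occurrence count of two nodes divided by the largest count, and activation is spread for a fixed number of steps with a constant decay factor. The names `build_links` and `spread` are hypothetical, and the output is not expected to reproduce the exact activation values on the slide.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, List, Tuple

# Nodes touched by each instance of the example table (missing data omitted).
INSTANCES: Dict[str, List[str]] = {
    "C1": ["coffee", "20%", "5"],
    "C2": ["cacao", "20%", "5"],
    "C3": ["coffee", "cacao", "20%", "5"],
    "C4": ["machinery", "5%", "2"],
    "C5": ["chemicals", "5%", "2"],
    "C6": ["chemicals", "machinery", "5%", "2"],
    "C7": ["chemicals", "cacao", "20%"],
    "C8": ["20%", "5"],
    "C9": ["coffee", "cacao"],
}

Links = Dict[Tuple[str, str], float]


def build_links(instances: Dict[str, List[str]]) -> Links:
    """Assumed weighting: co-occurrence count scaled so the strongest link is 1.0."""
    counts: Dict[Tuple[str, str], float] = defaultdict(float)
    for nodes in instances.values():
        for a, b in combinations(sorted(set(nodes)), 2):
            counts[(a, b)] += 1.0
    top = max(counts.values())
    return {pair: c / top for pair, c in counts.items()}


def spread(links: Links, seeds: Iterable[str], decay: float = 0.5,
           steps: int = 2) -> Dict[str, float]:
    """Activate the seed nodes with 1.0 and push activation over weighted links."""
    seeds = set(seeds)
    nodes = sorted({n for pair in links for n in pair})
    activation = {n: (1.0 if n in seeds else 0.0) for n in nodes}
    frontier = {n: 1.0 for n in seeds}
    for _ in range(steps):
        incoming: Dict[str, float] = defaultdict(float)
        for (a, b), weight in links.items():
            if a in frontier:
                incoming[b] += frontier[a] * weight * decay
            if b in frontier:
                incoming[a] += frontier[b] * weight * decay
        for node, value in incoming.items():
            activation[node] += value
        frontier = dict(incoming)  # only newly received activation spreads on
    return activation


LINKS = build_links(INSTANCES)
pattern = spread(LINKS, {"coffee", "cacao"})
for node, value in sorted(pattern.items(), key=lambda kv: -kv[1]):
    print(f"{node:10s} {value:.2f}")
```

The resulting dictionary is a Semantic Pattern in the sense of the slide: one activation value per node, so every pattern has the same length and the same meaning per position regardless of which instance or feature value seeded it.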
  • 16. IAIK Each feature value is represented by a semantic fingerprint. This allows for an instant analysis of semantic relations to other feature values. Operations: sort, mean, variance, adding, subtracting. [Figure: three bar charts of unsorted Semantic Patterns over the nodes coffee, cacao, machinery, chemicals, 20%, 5%, 5, 2 (activation axis 0 to 0.50), for "Export: Cacao", "Export: Coffee" and "Fertility: 2". Example table as on slide 10.]
  • 17. IAIK SPT: Layer 5 - Analysis. Calculating the distance between two patterns (Euclidean distance, cosine similarity) for unsupervised clustering and semantic-aware search algorithms.

      Keyword search for coffee
      C1  coffee                20%           5
      C3  coffee, cacao         20%           5
      C9  coffee, cacao         missing data  missing data

      Semantic-aware search for coffee
      C9  coffee, cacao         missing data  missing data
      C1  coffee                20%           5
      C3  coffee, cacao         20%           5
      C2  cacao                 20%           5
      C8  missing data          20%           5
      C7  chemicals, cacao      20%           missing data
      C5  chemicals             5%            2
      C6  chemicals, machinery  5%            2
      C4  machinery             5%            2
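A short sketch of the two distance measures named on this slide, applied to the instance patterns that the deck tabulates on slide 20. The ranking helper `rank_by_similarity` is a hypothetical name, and ranking by cosine similarity is only one plausible reading of "semantic-aware search". The query here is the pattern of C9, which slide 15 shows to be the pattern obtained by activating coffee and cacao, so the ranking is not meant to reproduce the "coffee" result list above.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Node order: coffee, cacao, machinery, chemicals, 20%, 5%, 5, 2
# Instance patterns copied from the Semantic Patterns table on slide 20.
PATTERNS: Dict[str, List[float]] = {
    "C1": [1.30, 0.53, 0.00, 0.08, 1.45, 0.00, 1.45, 0.00],
    "C2": [0.45, 1.38, 0.00, 0.15, 1.53, 0.00, 1.45, 0.00],
    "C3": [1.45, 1.53, 0.00, 0.15, 1.68, 0.00, 1.60, 0.00],
    "C4": [0.00, 0.00, 1.30, 0.38, 0.00, 1.38, 0.00, 1.38],
    "C5": [0.00, 0.08, 0.38, 1.30, 0.08, 1.38, 0.00, 1.38],
    "C6": [0.00, 0.08, 1.37, 1.37, 0.08, 1.53, 0.00, 1.53],
    "C7": [0.30, 1.30, 0.08, 1.15, 1.30, 0.15, 0.45, 0.15],
    "C8": [0.30, 0.38, 0.00, 0.08, 1.30, 0.00, 1.30, 0.00],
    "C9": [1.15, 1.15, 0.00, 0.08, 0.38, 0.00, 0.30, 0.00],
}


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two patterns of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two patterns; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_similarity(query: Sequence[float],
                       patterns: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Order all instances by decreasing cosine similarity to the query pattern."""
    scored = [(name, cosine_similarity(query, p)) for name, p in patterns.items()]
    return sorted(scored, key=lambda item: -item[1])


ranking = rank_by_similarity(PATTERNS["C9"], PATTERNS)
for name, score in ranking:
    print(name, round(score, 3))
```

Because every pattern has a non-zero activation on related nodes, instances that never mention the query terms (for example C2 or C8) still receive a score, which is what separates this from the keyword search.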
  • 18. IAIK SPT: Layer 5 - Analysis Machine learning: apply any machine learning algorithm to the Semantic Patterns Unsupervised clustering Supervised learning Semantic-aware search Knowledge discovery: semantic relations, arbitrary procedures: mean, variance etc. Anomaly detection, feature relevance, simple operations (variance, mean, etc.) Visualization
  • 19. IAIK Benefits? Application in heterogeneous domains regardless of the nature of the data. Except for Layer 1, we do not need any manual setup for the layers. Regardless of the analyzed data, the Semantic Patterns always use the same model. This means: regardless of the deployed knowledge discovery method, we can always use the same methods for knowledge extraction! [Figure: the machine learning pipeline shown twice (Domain-specific data set, Machine learning goals, Instance extraction, Feature selection/construction, Instance selection, Algorithm selection, Preprocessing, Algorithm application, Interpretation), with each step rated High, Medium or Low for its dependence on domain data and goals.]
  • 20. IAIK Comparing the two models. Same model: an Android application, a country or a document... the activation values always have the same meaning.

      Semantic Patterns
      Country  Coffee  Cacao  Machinery  Chemicals  20%   5%    5     2
      C1       1.30    0.53   0.00       0.08       1.45  0.00  1.45  0.00
      C2       0.45    1.38   0.00       0.15       1.53  0.00  1.45  0.00
      C3       1.45    1.53   0.00       0.15       1.68  0.00  1.60  0.00
      C4       0.00    0.00   1.30       0.38       0.00  1.38  0.00  1.38
      C5       0.00    0.08   0.38       1.30       0.08  1.38  0.00  1.38
      C6       0.00    0.08   1.37       1.37       0.08  1.53  0.00  1.53
      C7       0.30    1.30   0.08       1.15       1.30  0.15  0.45  0.15
      C8       0.30    0.38   0.00       0.08       1.30  0.00  1.30  0.00
      C9       1.15    1.15   0.00       0.08       0.38  0.00  0.30  0.00

      Value-centric feature vectors
      Country  Coffee        Cacao         Machinery     Chemicals     Unemployment rate  Fertility rate
      C1       1             0             0             0             20%                5
      C2       0             1             0             0             20%                5
      C3       1             1             0             0             20%                5
      C4       0             0             1             0             5%                 2
      C5       0             0             0             1             5%                 2
      C6       0             0             1             1             5%                 2
      C7       0             1             0             1             20%                missing data
      C8       missing data  missing data  missing data  missing data  20%                5
      C9       1             1             0             0             missing data       missing data

      [Figure: two bar charts of unsorted Semantic Patterns over the nodes coffee, cacao, machinery, chemicals, 20%, 5%, 5, 2: "Mean pattern: C4, C5, C6" (axis 0 to 1.50) and "Mean pattern: C1, C2, C3" (axis 0 to 2.00).]
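Because every position of a Semantic Pattern carries an activation value, simple vector arithmetic is meaningful. The sketch below computes the two mean patterns named in the figure from the table values above. The helper name `mean_pattern` is hypothetical, and the printed numbers are computed from the table rather than read off the charts.

```python
from typing import Dict, List, Sequence

NODES = ["coffee", "cacao", "machinery", "chemicals", "20%", "5%", "5", "2"]

# Rows C1 to C6 of the Semantic Patterns table on this slide.
PATTERNS: Dict[str, List[float]] = {
    "C1": [1.30, 0.53, 0.00, 0.08, 1.45, 0.00, 1.45, 0.00],
    "C2": [0.45, 1.38, 0.00, 0.15, 1.53, 0.00, 1.45, 0.00],
    "C3": [1.45, 1.53, 0.00, 0.15, 1.68, 0.00, 1.60, 0.00],
    "C4": [0.00, 0.00, 1.30, 0.38, 0.00, 1.38, 0.00, 1.38],
    "C5": [0.00, 0.08, 0.38, 1.30, 0.08, 1.38, 0.00, 1.38],
    "C6": [0.00, 0.08, 1.37, 1.37, 0.08, 1.53, 0.00, 1.53],
}


def mean_pattern(names: Sequence[str]) -> List[float]:
    """Average the patterns of the named instances position by position."""
    rows = [PATTERNS[name] for name in names]
    return [sum(column) / len(rows) for column in zip(*rows)]


agrarian = mean_pattern(["C1", "C2", "C3"])
industrial = mean_pattern(["C4", "C5", "C6"])

for node, a, b in zip(NODES, agrarian, industrial):
    print(f"{node:10s} {a:.2f} {b:.2f}")
```

The group labels `agrarian` and `industrial` are only illustrative variable names for the two clusters of countries. The same averaging would not be well defined on the value-centric vectors without first deciding how to treat missing data and mixed feature types.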
  • 21. IAIK Evaluation
    26 data sets from the UCI machine learning repository. Supervised: SVM. Unsupervised: EM and k-Means. Application to raw data and to Semantic Patterns.
    SP-Parameters: D=0.5, Comb=E, Norm=L, MDL=1.5, σ = 0.2
    (- = no entry on the slide)
    Data set       Label  Inst  DF  SF  Classes   SVM (N)  SVM (NN)   SVM (P)    KM (N)   KM (NN)    KM (P)   EM (NN)    EM (P)
    Categorical
    Breast Cancer  BC      286   -   9        2      0.03      0.04      0.04      0.01      0.01      0.06      0.00      0.08
    Dermatology    DE      366   1  33        6      0.93      0.92      0.95      0.58      0.09      0.86      0.87      0.87
    KR vs. KP      KR     3196   -  36        2      0.75      0.75      0.72      0.00      0.01      0.00      0.04      0.00
    Lymph          LY      148   -  18        4      0.53      0.51      0.48      0.13      0.18      0.25      0.26      0.27
    Mushroom       MU     8124   -  22        2      1.00      1.00      1.00      0.48      0.47      0.45      0.61      0.59
    Soybean        SO      683   -  35       19      0.92      0.92      0.93      0.59      0.62      0.73      0.79      0.79
    Splice         SP     3190   -  60        3      0.71      0.72      0.80      0.03      0.03      0.44      0.41      0.31
    Vote           VO      435   -  16        2      0.76      0.74      0.67      0.47      0.48      0.47      0.49      0.45
    Zoo            ZO      101   -  17        7      0.94      0.94      0.97      0.78      0.78      0.82      0.82      0.85
    Total                                            0.73      0.73      0.73      0.34      0.30      0.45      0.48      0.47
    Mixed
    Anneal         AN      898   6  32        6      0.86      0.86      0.92      0.23      0.03      0.30      0.31      0.32
    Colic          CO      368   7  15        2      0.31      0.32      0.31      0.13      0.03      0.05      0.10      0.12
    Credit-A       CA      689   6   9        2      0.41      0.41      0.39      0.16      0.02      0.25      0.17      0.21
    Credit-G       CG     1000   7  13        2      0.11      0.10      0.12      0.01      0.01      0.00      0.01      0.02
    Heart-C        HC      303   6   7        5      0.36      0.36      0.29      0.24      0.01      0.36      0.31      0.28
    Heart-H        HH      294   6   7        5      0.32      0.31      0.33      0.27      0.01      0.32      0.28      0.25
    Hepatitis      HE      155   5  14        2      0.25      0.28      0.21      0.13      0.00      0.21      0.22      0.24
    Total                                            0.37      0.38      0.37      0.17      0.02      0.21      0.20      0.20
    Numerical
    Breast-w       BW      699   9   -        2      0.78      0.78      0.77      0.73      0.74      0.82      0.72      0.58
    Diabetes       DI      768   8   -        2      0.18      0.18      0.15      0.05      0.03      0.10      0.10      0.08
    Glass          GL      214   9   -        7      0.30      0.30      0.50      0.34      0.39      0.33      0.37      0.36
    Heart-Statlog  HS      270  13   -        2      0.36      0.36      0.37      0.25      0.02      0.39      0.29      0.27
    Ionosphere     IO      351  34   -        2      0.48      0.48      0.50      0.12      0.12      0.16      0.25      0.25
    Iris           IR      150   4   -        3      0.87      0.87      0.87      0.71      0.71      0.75      0.81      0.78
    Segment        SE     2310  19   -        7      0.88      0.88      0.90      0.61      0.53      0.59      0.62      0.60
    Sonar          SO      208  60   -        2      0.23      0.23      0.23      0.01      0.01      0.02      0.01      0.01
    Vehicle        VE      846  18   -        4      0.51      0.51      0.48      0.11      0.19      0.19      0.10      0.19
    Vowel          VO      990  10   3       11      0.63      0.63      0.76      0.06      0.34      0.23      0.19      0.25
    Total                                            0.52      0.52      0.55      0.30      0.31      0.36      0.35      0.34
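    The "Total" rows of the evaluation table appear to be the unweighted mean over the data sets in each group. The slide does not state this; it is an observation from the numbers. A minimal Python sketch that checks it for four columns of the categorical group, with the values copied from the slide (the function and variable names are illustrative only):

```python
# Scores for the nine categorical data sets, copied from the evaluation slide
# (order: BC, DE, KR, LY, MU, SO, SP, VO, ZO).
categorical_scores = {
    "SVM (N)": [0.03, 0.93, 0.75, 0.53, 1.00, 0.92, 0.71, 0.76, 0.94],
    "KM (N)":  [0.01, 0.58, 0.00, 0.13, 0.48, 0.59, 0.03, 0.47, 0.78],
    "KM (P)":  [0.06, 0.86, 0.00, 0.25, 0.45, 0.73, 0.44, 0.47, 0.82],
    "EM (P)":  [0.08, 0.87, 0.00, 0.27, 0.59, 0.79, 0.31, 0.45, 0.85],
}

# "Total" values as printed on the slide for the same columns.
reported_totals = {"SVM (N)": 0.73, "KM (N)": 0.34, "KM (P)": 0.45, "EM (P)": 0.47}


def unweighted_mean(values):
    """Plain arithmetic mean; every data set counts equally, regardless of size."""
    return sum(values) / len(values)


for column, scores in categorical_scores.items():
    mean = round(unweighted_mean(scores), 2)
    print(f"{column}: computed {mean:.2f}, reported {reported_totals[column]:.2f}")
```

    Because the mean is unweighted, a small data set such as Zoo (101 instances) influences the total as much as Mushroom (8124 instances).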
  • 22. IAIK Does it work? Applications described in several publications, which analyze: e-Participation (Egyptian revolution, Fukushima, Mitmachen): text documents; intrusion detection: event correlation; RDF data analysis (semantic web); WiFi privacy (analyzing captured emails); Android Market application analysis
  • 23. IAIK Current Project Android application security Container applications for BYOD (require encryption, secure communication, key derivation functions, root checks etc.) Manual analysis is cumbersome Semantic Patterns Extract Dalvik VM code, features (opcodes, methods, local variables etc.) Apply Semantic Patterns technique Clustering, supervised learning, anomaly detection etc.
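    One simple way to turn extracted opcodes into a feature vector is a relative-frequency histogram. The sketch below is an assumption for illustration, not the feature construction used in the project; the function name, the opcode sequence, and the vocabulary are hypothetical, and only the opcode mnemonics themselves are real Dalvik instructions.

```python
from collections import Counter


def opcode_histogram(opcodes, vocabulary):
    """Return relative opcode frequencies in the order given by `vocabulary`.

    Opcodes outside the vocabulary are ignored.
    """
    counts = Counter(opcodes)
    total = sum(counts[op] for op in vocabulary)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[op] / total for op in vocabulary]


# Hypothetical opcode sequence of a single method (not taken from a real app).
method_opcodes = [
    "const-string",
    "invoke-virtual",
    "move-result-object",
    "invoke-virtual",
    "return-void",
]
vocabulary = ["const-string", "invoke-virtual", "move-result-object", "return-void"]

features = opcode_histogram(method_opcodes, vocabulary)
print(dict(zip(vocabulary, features)))
```

    A vector like this could then be handed to any of the listed tasks (clustering, supervised learning, anomaly detection), either directly or after a transformation step.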
  • 25. IAIK Current Project Also works directly on the phone... Detecting SMS catchers/sniffers More fine-grained detection: asymmetric cryptography, symmetric cryptography
  • 26. IAIK Outlook Publish the Java API... basically a converter from arbitrary feature vectors to Semantic Patterns (e.g. in/out in ARFF format) Deep learning...
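    For readers unfamiliar with ARFF, the sketch below parses a small subset of the format (comments, @attribute, @data with comma-separated rows) into feature vectors. It is written in Python for brevity, is not the Java API mentioned on the slide, and does not perform any Semantic Pattern conversion; the function name and the sample relation are hypothetical. Quoted values and sparse rows are not handled.

```python
def parse_arff(text):
    """Parse a minimal ARFF subset into (attributes, rows).

    attributes: list of (name, type) pairs in declaration order.
    rows: list of dicts mapping attribute name to the raw string value.
    """
    attributes = []
    rows = []
    in_data = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            values = [value.strip() for value in line.split(",")]
            names = [name for name, _ in attributes]
            rows.append(dict(zip(names, values)))
        elif lower.startswith("@attribute"):
            _, name, attr_type = line.split(None, 2)
            attributes.append((name, attr_type))
        elif lower.startswith("@data"):
            in_data = True
    return attributes, rows


# Hypothetical input with one numeric and one nominal feature.
sample = """
% illustrative data only
@relation apps
@attribute permissions numeric
@attribute category {game,tool}
@data
3,game
7,tool
"""

attributes, rows = parse_arff(sample)
print(attributes)
print(rows)
```

    Values are kept as strings so that numeric and nominal features can be treated differently by whatever step consumes the vectors.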
  • 28. IAIK
    K-Means
    Par      Total  BC     DE     KR     LY     MU     SO     SP     VO     ZO
    Raw Data
    N        0.341  0.012  0.584  0.004  0.131  0.475  0.587  0.031  0.467  0.782
    NN       0.296  0.007  0.094  0.010  0.176  0.472  0.616  0.030  0.476  0.783
    Semantic Patterns
    D 0.0    0.443  0.025  0.849  0.003  0.199  0.413  0.728  0.465  0.493  0.814
    Comb=E Norm=L
    D 0.1    0.442  0.029  0.811  0.004  0.245  0.545  0.726  0.387  0.476  0.759
    D 0.3    0.447  0.068  0.846  0.004  0.241  0.482  0.724  0.424  0.476  0.758
    D 0.5    0.452  0.061  0.856  0.000  0.245  0.448  0.733  0.437  0.467  0.820
    D 0.7    0.422  0.069  0.826  0.000  0.209  0.275  0.728  0.419  0.463  0.804
    Comb=S Norm=L
    D 0.1    0.441  0.056  0.853  0.000  0.244  0.453  0.733  0.399  0.476  0.759
    D 0.3    0.434  0.075  0.820  0.000  0.228  0.411  0.718  0.431  0.472  0.750
    D 0.5    0.439  0.060  0.792  0.000  0.235  0.416  0.741  0.405  0.463  0.836
    D 0.7    0.422  0.067  0.798  0.000  0.224  0.364  0.726  0.376  0.462  0.782
    Comb=E Norm=S
    D 0.1    0.418  0.029  0.790  0.006  0.236  0.311  0.705  0.449  0.496  0.742
    D 0.3    0.452  0.030  0.860  0.001  0.231  0.470  0.715  0.475  0.491  0.799
    D 0.5    0.448  0.048  0.799  0.009  0.215  0.539  0.725  0.450  0.493  0.758
    D 0.7    0.448  0.033  0.850  0.000  0.230  0.495  0.712  0.435  0.493  0.787
    Comb=S Norm=S
    D 0.1    0.439  0.029  0.806  0.009  0.250  0.435  0.727  0.439  0.494  0.760
    D 0.3    0.420  0.015  0.775  0.004  0.210  0.436  0.717  0.409  0.443  0.774
    D 0.5    0.429  0.030  0.789  0.009  0.226  0.410  0.716  0.448  0.485  0.749
    D 0.7    0.438  0.040  0.839  0.006  0.246  0.418  0.726  0.409  0.480  0.775
    EM
    Par      Total  BC     DE     KR     LY     MU     SO     SP     VO     ZO
    Raw Data
    N        Not available
    NN       0.477  0.002  0.871  0.036  0.258  0.610  0.789  0.410  0.494  0.822
    Semantic Patterns
    D 0.0    0.449  0.004  0.767  0.001  0.222  0.590  0.740  0.423  0.489  0.801
    Comb=E Norm=L
    D 0.1    0.441  0.074  0.885  0.000  0.271  0.615  0.786  0.004  0.505  0.826
    D 0.3    0.460  0.079  0.875  0.001  0.258  0.592  0.788  0.250  0.449  0.846
    D 0.5    0.468  0.079  0.874  0.001  0.265  0.592  0.789  0.306  0.452  0.850
    D 0.7    0.465  0.079  0.874  0.001  0.252  0.579  0.799  0.312  0.445  0.847
    Comb=S Norm=L
    D 0.1    0.433  0.079  0.872  0.001  0.270  0.572  0.794  0.001  0.476  0.829
    D 0.3    0.466  0.079  0.881  0.001  0.280  0.592  0.802  0.298  0.437  0.828
    D 0.5    0.466  0.079  0.871  0.001  0.251  0.581  0.805  0.310  0.445  0.848
    D 0.7    0.462  0.087  0.875  0.001  0.254  0.580  0.776  0.292  0.445  0.845
    Comb=E Norm=S
    D 0.1    0.472  0.002  0.893  0.000  0.263  0.571  0.767  0.432  0.495  0.820
    D 0.3    0.476  0.002  0.914  0.000  0.261  0.586  0.775  0.427  0.495  0.823
    D 0.5    0.472  0.002  0.897  0.000  0.267  0.584  0.758  0.427  0.484  0.829
    D 0.7    0.473  0.002  0.903  0.000  0.250  0.586  0.773  0.427  0.484  0.829
    Comb=S Norm=S
    D 0.1    0.475  0.002  0.903  0.000  0.254  0.576  0.764  0.429  0.495  0.852
    D 0.3    0.474  0.002  0.901  0.000  0.271  0.584  0.763  0.427  0.484  0.837
    D 0.5    0.476  0.002  0.904  0.000  0.255  0.586  0.767  0.427  0.484  0.854
    D 0.7    0.480  0.002  0.910  0.000  0.269  0.615  0.771  0.431  0.494  0.825
  • 29. IAIK
    K-Means
    Par      Total  AN     CO     CA     CG     HC     HH     HE
    Raw Data
    N        0.165  0.226  0.129  0.155  0.009  0.237  0.269  0.131
    NN       0.017  0.028  0.030  0.016  0.012  0.014  0.012  0.004
    Semantic Patterns
    D=0.0 MDL=2.0
    σ 0.0    0.193  0.253  0.135  0.113  0.007  0.356  0.293  0.195
    σ 0.2    0.198  0.271  0.147  0.116  0.007  0.356  0.301  0.189
    σ 0.4    0.204  0.240  0.157  0.145  0.009  0.356  0.327  0.194
    σ 0.6    0.194  0.221  0.154  0.145  0.008  0.359  0.275  0.196
    σ 0.8    0.200  0.258  0.152  0.098  0.007  0.358  0.327  0.197
    D=0.5 MDL=1.0
    σ 0.0    0.211  0.320  0.042  0.262  0.001  0.325  0.311  0.215
    σ 0.2    0.201  0.257  0.032  0.262  0.001  0.323  0.311  0.222
    σ 0.4    0.208  0.299  0.035  0.261  0.001  0.326  0.311  0.220
    σ 0.6    0.204  0.281  0.029  0.262  0.001  0.325  0.311  0.220
    σ 0.8    0.207  0.292  0.041  0.263  0.001  0.326  0.311  0.216
    D=0.5 MDL=1.5
    σ 0.0    0.216  0.317  0.065  0.249  0.001  0.357  0.320  0.203
    σ 0.2    0.211  0.295  0.052  0.247  0.000  0.355  0.320  0.209
    σ 0.4    0.216  0.314  0.074  0.248  0.001  0.357  0.320  0.198
    σ 0.6    0.212  0.308  0.046  0.249  0.001  0.356  0.320  0.209
    σ 0.8    0.211  0.293  0.063  0.248  0.000  0.354  0.320  0.201
    D=0.5 MDL=2.0
    σ 0.0    0.217  0.304  0.048  0.244  0.000  0.390  0.311  0.219
    σ 0.2    0.218  0.313  0.062  0.244  0.000  0.388  0.311  0.208
    σ 0.4    0.221  0.309  0.084  0.243  0.000  0.389  0.311  0.209
    σ 0.6    0.213  0.285  0.057  0.243  0.000  0.387  0.311  0.210
    σ 0.8    0.211  0.295  0.036  0.244  0.000  0.387  0.311  0.205
    D=0.5 MDL=3.0
    σ 0.0    0.203  0.294  0.030  0.248  0.000  0.335  0.315  0.196
    σ 0.2    0.208  0.306  0.059  0.248  0.000  0.334  0.315  0.193
    σ 0.4    0.205  0.310  0.050  0.248  0.000  0.334  0.315  0.178
    σ 0.6    0.207  0.300  0.063  0.248  0.001  0.333  0.313  0.192
    σ 0.8    0.210  0.330  0.050  0.246  0.001  0.336  0.315  0.191
    EM
    Par      Total  AN     CO     CA     CG     HC     HH     HE
    Raw Data
    N        Not available
    NN       0.201  0.312  0.103  0.171  0.013  0.309  0.278  0.223
    Semantic Patterns
    D=0.0 MDL=1.0
    σ 0.0    0.190  0.291  0.098  0.227  0.003  0.228  0.258  0.227
    σ 0.2    0.182  0.280  0.098  0.162  0.003  0.244  0.258  0.231
    σ 0.4    0.184  0.226  0.099  0.229  0.004  0.245  0.258  0.227
    σ 0.6    0.194  0.291  0.097  0.240  0.003  0.217  0.281  0.229
    σ 0.8    0.192  0.293  0.097  0.232  0.004  0.228  0.258  0.230
    D=0.7 MDL=1.0
    σ 0.0    0.210  0.327  0.127  0.218  0.021  0.237  0.311  0.229
    σ 0.2    0.210  0.322  0.126  0.218  0.021  0.237  0.320  0.229
    σ 0.4    0.211  0.322  0.127  0.218  0.021  0.237  0.320  0.229
    σ 0.6    0.211  0.321  0.128  0.218  0.021  0.237  0.320  0.229
    σ 0.8    0.209  0.310  0.127  0.218  0.021  0.237  0.320  0.229
    D=0.7 MDL=1.5
    σ 0.0    0.204  0.322  0.123  0.212  0.016  0.275  0.247  0.233
    σ 0.2    0.204  0.322  0.123  0.212  0.016  0.275  0.247  0.236
    σ 0.4    0.205  0.323  0.123  0.206  0.016  0.275  0.252  0.237
    σ 0.6    0.204  0.320  0.125  0.208  0.016  0.275  0.246  0.236
    σ 0.8    0.204  0.323  0.125  0.208  0.016  0.275  0.249  0.232
    D=0.7 MDL=2.0
    σ 0.0    0.206  0.319  0.117  0.229  0.010  0.255  0.277  0.233
    σ 0.2    0.207  0.317  0.126  0.239  0.010  0.255  0.268  0.233
    σ 0.4    0.205  0.319  0.127  0.224  0.010  0.255  0.268  0.233
    σ 0.6    0.206  0.307  0.127  0.240  0.010  0.255  0.268  0.233
    σ 0.8    0.204  0.305  0.127  0.240  0.010  0.255  0.259  0.233
    D=0.7 MDL=3.0
    σ 0.0    0.192  0.323  0.108  0.248  0.009  0.201  0.250  0.205
    σ 0.2    0.190  0.321  0.107  0.237  0.009  0.201  0.251  0.205
    σ 0.4    0.193  0.322  0.122  0.243  0.009  0.201  0.249  0.205
    σ 0.6    0.192  0.321  0.122  0.243  0.010  0.201  0.245  0.205
    σ 0.8    0.192  0.323  0.122  0.243  0.009  0.201  0.240  0.205
  • 30. IAIK
    K-Means
    Par      Total  BW     DI     GL     HS     IO     IR     SE     SO     VE     VO
    Raw Data
    N        0.299  0.734  0.052  0.335  0.254  0.121  0.708  0.608  0.006  0.113  0.057
    NN       0.307  0.735  0.030  0.388  0.019  0.123  0.705  0.529  0.008  0.188  0.342
    Semantic Patterns
    D=0.0 MDL=1.5
    σ 0.0    0.315  0.724  0.039  0.329  0.309  0.045  0.717  0.582  0.026  0.198  0.183
    σ 0.2    0.323  0.724  0.025  0.334  0.344  0.071  0.730  0.590  0.012  0.198  0.196
    σ 0.4    0.318  0.719  0.026  0.285  0.316  0.051  0.769  0.600  0.008  0.199  0.203
    σ 0.6    0.317  0.722  0.025  0.298  0.357  0.040  0.712  0.602  0.013  0.199  0.201
    σ 0.8    0.299  0.646  0.015  0.294  0.328  0.026  0.686  0.581  0.014  0.198  0.200
    D=0.5 MDL=1.0
    σ 0.0    0.333  0.817  0.072  0.293  0.338  0.181  0.611  0.614  0.009  0.164  0.234
    σ 0.2    0.333  0.817  0.076  0.278  0.340  0.181  0.621  0.621  0.009  0.151  0.237
    σ 0.4    0.326  0.817  0.068  0.286  0.335  0.181  0.587  0.604  0.009  0.149  0.228
    σ 0.6    0.327  0.817  0.072  0.269  0.337  0.181  0.604  0.580  0.009  0.166  0.232
    σ 0.8    0.334  0.817  0.071  0.303  0.336  0.181  0.610  0.605  0.011  0.163  0.244
    D=0.5 MDL=1.5
    σ 0.0    0.352  0.817  0.099  0.298  0.382  0.143  0.751  0.601  0.018  0.193  0.218
    σ 0.2    0.358  0.817  0.100  0.330  0.385  0.163  0.751  0.588  0.015  0.194  0.232
    σ 0.4    0.352  0.817  0.096  0.315  0.387  0.143  0.738  0.576  0.019  0.193  0.231
    σ 0.6    0.348  0.817  0.103  0.288  0.383  0.158  0.716  0.579  0.015  0.194  0.226
    σ 0.8    0.356  0.817  0.098  0.296  0.378  0.166  0.776  0.604  0.012  0.190  0.225
    D=0.5 MDL=2.0
    σ 0.0    0.329  0.817  0.054  0.339  0.330  0.064  0.752  0.563  0.017  0.151  0.199
    σ 0.2    0.328  0.817  0.052  0.320  0.330  0.064  0.753  0.585  0.017  0.144  0.196
    σ 0.4    0.331  0.817  0.055  0.313  0.330  0.109  0.767  0.562  0.012  0.149  0.194
    σ 0.6    0.330  0.817  0.059  0.335  0.328  0.073  0.765  0.560  0.019  0.148  0.199
    σ 0.8    0.333  0.817  0.064  0.321  0.330  0.068  0.764  0.593  0.013  0.158  0.200
    D=0.5 MDL=3.0
    σ 0.0    0.322  0.817  0.026  0.326  0.333  0.099  0.739  0.567  0.022  0.136  0.153
    σ 0.2    0.322  0.817  0.029  0.326  0.320  0.127  0.702  0.583  0.017  0.150  0.150
    σ 0.4    0.317  0.817  0.035  0.318  0.320  0.099  0.705  0.556  0.024  0.140  0.154
    σ 0.6    0.328  0.817  0.026  0.342  0.328  0.118  0.759  0.563  0.020  0.150  0.153
    σ 0.8    0.323  0.817  0.029  0.330  0.322  0.099  0.731  0.563  0.023  0.151  0.161
    EM
    Par      Total  BW     DI     GL     HS     IO     IR     SE     SO     VE     VO
    Raw Data
    N        Not available
    NN       0.346  0.718  0.103  0.370  0.289  0.254  0.806  0.621  0.005  0.103  0.194
    Semantic Patterns
    D=0.0 MDL=1.0
    σ 0.0    0.317  0.777  0.006  0.312  0.239  0.218  0.651  0.592  0.016  0.174  0.186
    σ 0.2    0.327  0.752  0.001  0.318  0.240  0.218  0.766  0.598  0.016  0.167  0.197
    σ 0.4    0.323  0.727  0.011  0.287  0.229  0.217  0.749  0.600  0.018  0.176  0.218
    σ 0.6    0.317  0.732  0.009  0.316  0.232  0.221  0.637  0.606  0.025  0.175  0.214
    σ 0.8    0.325  0.703  0.006  0.305  0.233  0.216  0.796  0.594  0.019  0.181  0.195
    D=0.7 MDL=1.0
    σ 0.0    0.302  0.579  0.082  0.332  0.285  0.184  0.633  0.634  0.006  0.099  0.183
    σ 0.2    0.300  0.579  0.082  0.307  0.285  0.184  0.636  0.632  0.006  0.117  0.176
    σ 0.4    0.301  0.579  0.086  0.310  0.285  0.184  0.639  0.643  0.006  0.095  0.183
    σ 0.6    0.301  0.579  0.076  0.319  0.285  0.184  0.639  0.632  0.006  0.109  0.185
    σ 0.8    0.300  0.579  0.079  0.311  0.285  0.184  0.633  0.633  0.006  0.109  0.183
    D=0.7 MDL=1.5
    σ 0.0    0.339  0.579  0.086  0.348  0.324  0.242  0.761  0.596  0.013  0.187  0.252
    σ 0.2    0.339  0.579  0.086  0.356  0.324  0.242  0.761  0.595  0.012  0.192  0.239
    σ 0.4    0.340  0.579  0.092  0.348  0.324  0.242  0.761  0.603  0.012  0.194  0.241
    σ 0.6    0.339  0.579  0.094  0.355  0.324  0.242  0.761  0.602  0.012  0.181  0.240
    σ 0.8    0.338  0.579  0.107  0.355  0.324  0.242  0.752  0.597  0.012  0.177  0.236
    D=0.7 MDL=2.0
    σ 0.0    0.323  0.579  0.105  0.347  0.266  0.228  0.784  0.585  0.015  0.092  0.227
    σ 0.2    0.325  0.579  0.098  0.359  0.266  0.228  0.784  0.584  0.015  0.098  0.238
    σ 0.4    0.323  0.579  0.105  0.358  0.266  0.228  0.784  0.576  0.015  0.090  0.230
    σ 0.6    0.326  0.579  0.099  0.351  0.266  0.228  0.798  0.595  0.015  0.091  0.235
    σ 0.8    0.326  0.579  0.104  0.361  0.266  0.228  0.798  0.585  0.015  0.090  0.237
    D=0.7 MDL=3.0
    σ 0.0    0.304  0.579  0.001  0.362  0.200  0.228  0.728  0.574  0.032  0.114  0.224
    σ 0.2    0.307  0.579  0.000  0.364  0.208  0.228  0.735  0.573  0.029  0.113  0.236
    σ 0.4    0.306  0.579  0.001  0.355  0.211  0.228  0.726  0.572  0.035  0.113  0.237
    σ 0.6    0.307  0.579  0.001  0.363  0.219  0.228  0.729  0.575  0.029  0.113  0.233
    σ 0.8    0.304  0.579  0.001  0.356  0.204  0.224  0.713  0.589  0.030  0.119  0.226
  • 31. IAIK
    Distance  Euc                                             Cos
    Data      Raw                     Semantic Patterns       Raw                     Semantic Patterns
    Missing   0%    10%   50%   90%   0%    10%   50%   90%   0%    10%   50%   90%   0%    10%   50%   90%
    Categorical
    BC        0.52  0.52  0.52  0.52  0.54  0.54  0.53  0.50  0.53  0.53  0.53  0.51  0.54  0.54  0.53  0.51
    DE        0.68  0.66  0.55  0.32  0.81  0.80  0.38  0.22  0.66  0.66  0.67  0.36  0.81  0.80  0.74  0.46
    KR        0.54  0.54  0.53  0.52  0.52  0.52  0.51  0.50  0.54  0.54  0.53  0.51  0.52  0.52  0.52  0.51
    LY        0.63  0.68  0.63  0.30  0.63  0.59  0.64  0.48  0.59  0.53  0.51  0.32  0.61  0.58  0.56  0.35
    MU        0.64  0.64  0.62  0.57  0.68  0.67  0.62  0.53  0.57  0.57  0.56  0.54  0.67  0.67  0.67  0.62
    SO        0.65  0.63  0.53  0.22  0.75  0.70  0.09  0.08  0.58  0.56  0.50  0.18  0.73  0.72  0.63  0.28
    SP        0.48  0.47  0.44  0.38  0.62  0.46  0.39  0.39  0.44  0.44  0.41  0.37  0.57  0.57  0.54  0.45
    VO        0.80  0.79  0.76  0.67  0.78  0.78  0.68  0.51  0.62  0.63  0.67  0.62  0.79  0.79  0.78  0.72
    ZO        0.83  0.81  0.72  0.31  0.86  0.85  0.64  0.24  0.80  0.79  0.71  0.31  0.86  0.84  0.76  0.41
    Total     0.64  0.64  0.59  0.42  0.69  0.66  0.50  0.38  0.59  0.58  0.57  0.41  0.68  0.67  0.64  0.48
    Mixed
    AN        0.64  0.63  0.55  0.38  0.66  0.67  0.51  0.38  0.44  0.46  0.50  0.38  0.66  0.66  0.61  0.42
    CO        0.59  0.59  0.56  0.51  0.59  0.58  0.52  0.50  0.50  0.50  0.51  0.51  0.62  0.62  0.60  0.57
    CA        0.62  0.61  0.59  0.54  0.65  0.65  0.60  0.52  0.55  0.55  0.54  0.51  0.65  0.64  0.63  0.57
    CG        0.52  0.52  0.52  0.50  0.52  0.53  0.54  0.53  0.51  0.51  0.52  0.51  0.52  0.52  0.52  0.52
    HC        0.86  0.86  0.85  0.81  0.87  0.87  0.85  0.81  0.81  0.81  0.82  0.81  0.87  0.87  0.86  0.84
    HH        0.87  0.86  0.85  0.82  0.87  0.87  0.83  0.80  0.84  0.84  0.83  0.81  0.88  0.88  0.87  0.83
    HE        0.59  0.58  0.56  0.50  0.64  0.64  0.58  0.55  0.52  0.51  0.55  0.52  0.65  0.65  0.64  0.57
    Total     0.67  0.67  0.64  0.58  0.69  0.69  0.63  0.58  0.60  0.60  0.61  0.58  0.69  0.69  0.68  0.62
    Numerical
    BW        0.86  0.86  0.76  0.68  0.91  0.91  0.84  0.69  0.62  0.61  0.59  0.50  0.90  0.89  0.88  0.84
    DI        0.55  0.54  0.53  0.53  0.56  0.55  0.54  0.50  0.53  0.53  0.52  0.50  0.56  0.55  0.55  0.53
    GL        0.49  0.45  0.31  0.30  0.53  0.52  0.42  0.31  0.51  0.51  0.48  0.29  0.53  0.52  0.48  0.34
    HS        0.64  0.63  0.59  0.52  0.69  0.69  0.61  0.53  0.54  0.54  0.55  0.51  0.69  0.69  0.65  0.60
    IO        0.51  0.52  0.55  0.54  0.61  0.61  0.56  0.46  0.46  0.46  0.47  0.51  0.61  0.61  0.60  0.57
    IR        0.81  0.60  0.47  0.33  0.83  0.81  0.75  0.67  0.87  0.84  0.77  0.34  0.84  0.81  0.76  0.75
    SE        0.61  0.53  0.21  0.15  0.57  0.57  0.43  0.17  0.39  0.40  0.44  0.27  0.57  0.57  0.55  0.41
    SO        0.54  0.53  0.51  0.50  0.54  0.54  0.51  0.50  0.52  0.52  0.52  0.52  0.54  0.54  0.54  0.53
    VE        0.35  0.33  0.29  0.26  0.37  0.37  0.35  0.28  0.36  0.36  0.36  0.31  0.37  0.37  0.36  0.33
    VO        0.15  0.15  0.12  0.09  0.22  0.21  0.16  0.10  0.20  0.20  0.17  0.10  0.21  0.21  0.20  0.13
    Total     0.55  0.51  0.43  0.39  0.58  0.58  0.52  0.42  0.50  0.50  0.49  0.38  0.58  0.58  0.56  0.50
  • 32. IAIK
              RAW                           Baseline            Semantic Patterns
    Data set  EUC (N)   EUC (NN)  COS (NN)  EUC (NN)  COS (NN)  EUC (NN)  COS (NN)
    Categorical
    BC        0.52      0.53      0.53      0.52      0.53      0.54      0.54
    DE        0.68      0.68      0.66      0.67      0.67      0.81      0.81
    KR        0.54      0.54      0.54      0.54      0.54      0.52      0.52
    LY        0.63      0.63      0.59      0.60      0.57      0.63      0.61
    MU        0.64      0.64      0.57      0.64      0.64      0.68      0.67
    SO        0.65      0.65      0.58      0.69      0.70      0.75      0.73
    SP        0.48      0.48      0.44      0.48      0.48      0.62      0.57
    VO        0.80      0.80      0.62      0.80      0.80      0.78      0.79
    ZO        0.84      0.83      0.80      0.85      0.84      0.86      0.86
    Total     0.64      0.64      0.59      0.64      0.64      0.69      0.68
    Mixed
    AN        0.64      0.64      0.44      0.64      0.65      0.65      0.66
    CO        0.59      0.59      0.50      0.59      0.60      0.58      0.62
    CA        0.62      0.62      0.55      0.61      0.61      0.61      0.65
    CG        0.52      0.52      0.51      0.52      0.52      0.52      0.52
    HC        0.86      0.86      0.81      0.85      0.85      0.86      0.87
    HH        0.87      0.87      0.84      0.86      0.86      0.86      0.88
    HE        0.59      0.59      0.52      0.61      0.60      0.63      0.65
    Total     0.67      0.67      0.60      0.67      0.67      0.67      0.69
    Numerical
    BW        0.86      0.86      0.62      0.74      0.74      0.89      0.90
    DI        0.55      0.55      0.53      0.54      0.54      0.55      0.56
    GL        0.49      0.49      0.51      0.51      0.51      0.53      0.53
    HS        0.64      0.64      0.54      0.63      0.63      0.66      0.69
    IO        0.51      0.51      0.46      0.55      0.55      0.63      0.61
    IR        0.81      0.81      0.87      0.73      0.73      0.81      0.83
    SE        0.61      0.61      0.39      0.54      0.54      0.57      0.57
    SO        0.54      0.54      0.52      0.54      0.54      0.54      0.54
    VE        0.35      0.35      0.36      0.37      0.37      0.36      0.37
    VO        0.15      0.15      0.20      0.21      0.21      0.22      0.21
    Total     0.55      0.55      0.50      0.54      0.54      0.58      0.58