2. 2 Learning Convolutional Neural Networks for Graphs
Representation Learning for Graphs
Telecom Safety Transportation Industry Smart cities
(Edge) deployment
Deep Learning System
Intermediate Representation: Graphs
?
3. 3 Learning Convolutional Neural Networks for Graphs
Problem Definition
▌ Input: Finite collection of graphs
● Nodes of any two graphs are not necessarily in correspondence
● Nodes and edges may have attributes (discrete and continuous)
▌ Problem: Learn a representation for classification/regression
▌ Example: Graph classification problem
…
[ ] = ?class
4. 4 Learning Convolutional Neural Networks for Graphs
State of the Art: Graph Kernels
▌ Define kernel based on substructures
● Shortest paths
● Random walks
● Subtrees
● ...
▌ Kernel is similarity function on pairs of graphs
● Count the number of common substructures
▌ Use graph kernels with SVMs
19. 19 Learning Convolutional Neural Networks for Graphs
Patchy: Learning CNNs for Graphs
neighborhood normalization
● normalized neighborhoods serve as receptive fields
● node and edge attributes correspond to channels
Convolutional architecture
1
2
3
4
1 2 3 4
a
1…
a
m
1 2 3 4
a
1…
a
n
1
2
3
4
1 2 3 4
a
1…
a
m
1 2 3 4
a
1…
a
n
1
2
3
4
1 2 3 4
a
1…
a
m
1 2 3 4
a
1…
a
n
…
1
2
3
4
1
2
3
4
1
23
4
1 2
3
4
1
2
3
4
12 3
4
20. 20 Learning Convolutional Neural Networks for Graphs
Node Sequence Selection
▌ We use centrality measures to generate the node sequences
▌ Nodes with similar structural roles are aligned across graphs
node sequence selection
A: Betweenness centrality B: Closeness
centrality
C: Eigenvector centrality D: Degree centrality
21. 21 Learning Convolutional Neural Networks for Graphs
▌ Simple breadth-first expansion until at least k nodes added,
or no additional nodes to add
Neighborhood Assembly
neighborhood assembly
22. 22 Learning Convolutional Neural Networks for Graphs
▌ Nodes of any two graphs should have similar position in the
adjacency matrices iff their structural roles are similar
▌ Result: For several distance measure pairs
it is possible to efficiently compare labeling
methods without supervision
▌ Example: ||A – A’||1
and edit distance
on graphs
Graph Normalization Problem
1
2
3
4
normalization
distance measures in
matrix and graph space
adjacency matrices under labeling
labeling
method
(centrality etc.)
26. 26 Learning Convolutional Neural Networks for Graphs
Computational Complexity
▌ At most linear in number of input graphs
▌ At most quadratic in number of nodes for each graph
(depends on maximal node degree and centrality measure)
27. 27 Learning Convolutional Neural Networks for Graphs
Convolutional Architecture
1 2 3 4
a
1…
a
n
1 2 3 4
a
1…
a
n
1 2 3 4
a
1…
a
n
1 2 3 4
a
1…
a
n
1 2 3 4
a
1…
a
n
…
v
1…
v
M
field size: 4, stride: 4, filters: M
28. 28 Learning Convolutional Neural Networks for Graphs
Convolutional Architecture
a
1…
a
n
a
1…
a
n
a
1…
a
n
a
1…
a
n
a
1…
a
n
…
v
1…
v
M
field size: 4, stride: 4, filters: M
1 2 3 4 1 2 3 4 1 2 3 41 2 3 4 1 2 3 4
29. 29 Learning Convolutional Neural Networks for Graphs
Convolutional Architecture
a
1…
a
n
a
1…
a
n
a
1…
a
n
a
1…
a
n
a
1…
a
n
…
v
1…
v
M
field size: 4, stride: 4, filters: M
v
1…
v
M
…
1 2 3 4 1 2 3 4 1 2 3 41 2 3 4 1 2 3 4
30. 30 Learning Convolutional Neural Networks for Graphs
Convolutional Architecture
a
1…
a
n
a
1…
a
n
a
1…
a
n
a
1…
a
n
a
1…
a
n
…
field size: 3, stride: 1, N filters
v
1…
v
N
field size: 4, stride: 4, filters: M
v
1…
v
M
v
1…
v
M
…
1 2 3 4 1 2 3 4 1 2 3 41 2 3 4 1 2 3 4
31. 31 Learning Convolutional Neural Networks for Graphs
Convolutional Architecture
a
1…
a
n
a
1…
a
n
a
1…
a
n
a
1…
a
n
a
1…
a
n
…
field size: 3, stride: 1, N filters
v
1…
v
N
field size: 4, stride: 4, filters: Mv
1…
v
M
v
1…
v
M
…
1 2 3 4 1 2 3 4 1 2 3 41 2 3 4 1 2 3 4
32. 32 Learning Convolutional Neural Networks for Graphs
Convolutional Architecture
a
1…
a
n
a
1…
a
n
a
1…
a
n
a
1…
a
n
a
1…
a
n
…
field size: 3, stride: 1, N filters
field size: 4, stride: 4, filters: Mv
1…
v
M
v
1…
v
N
v
1…
v
N
v
1…
v
M
…
…
1 2 3 4 1 2 3 4 1 2 3 41 2 3 4 1 2 3 4
33. 33 Learning Convolutional Neural Networks for Graphs
Experiments - Graph Classification
▌ Finite collection of graphs and their class labels
● Nodes of any two graphs are not necessarily in correspondence
● Nodes and edges may have attributes (discrete and continuous)
▌ Learn a function from graphs to class labels
…
[ ] = ?class
class = 1 class = 0 class = 1
34. 34 Learning Convolutional Neural Networks for Graphs
Experiments - Convolutional Architecture
1 2 … 10
a
1…
a
n
1 2 … 10
a
1…
a
n
1 2 … 10
a
1…
a
n
1 2 … 10
a
1…
a
n
1 2 … 10
a
1…
a
n
…
field size: k, stride: k, filters: 16
field size: 10, stride: 1, filters: 8
flatten, dense 128 units
softmax
1
…
16
1
…
16
…
1
…
8
1
…
8
…
35. 35 Learning Convolutional Neural Networks for Graphs
Classification Datasets
▌ MUTAG: Nitro compounds where classes indicate mutagenic
effect on a bacterium (Salmonella Typhimurium)
▌ PTC: Chemical compounds where classes indicate
carcinogenicity for male and female rats
▌ NCI: Chemical compounds where classes indicate activity
against non-small cell lung cancer and ovarian cancer cell lines
▌ D&D: Protein structures where classes indicate whether
structure is an enzyme or not
▌ ...
36. 36 Learning Convolutional Neural Networks for Graphs
▌ Q: How efficient and effective compared to graph kernels?
▌ Apply Patchy to typical graph classification benchmark data
Experiments - Graph Classification
37. 37 Learning Convolutional Neural Networks for Graphs
Experiments - Visualization
▌ Q: What do learned edge filters look like?
▌ Restricted Boltzmann machine applied to graphs
▌ Receptive field size of hidden layer: 9
small instances of graphs weights of hidden nodes
graphs sampled from RBM
38. 38 Learning Convolutional Neural Networks for Graphs
Discussion
▌ Pros:
● Graph kernel design not required
● Outperforms graph kernels on several datasets (speed and accuracy)
● Incorporates node and edge features (discrete and continuous)
● Supports visualizations (graph motifs, etc.)
▌ Cons:
● Prone to overfitting on smaller data sets (graph kernel benchmarks)
● Shift from designing graph kernels to tuning hyperparameters
● Graph normalization not part of learning
code to be released: patchy.neclab.eu