SlideShare una empresa de Scribd logo
1 de 19
LAB SEMINAR
Nguyen Thanh Sang
Network Science Lab
Dept. of Artificial Intelligence
The Catholic University of Korea
E-mail: sang.ngt99@gmail.com
Are More Layers Beneficial to Graph Transformers?
--- Abdullah, Mariam, Jiayuan He, and Ke Wang ---
2023-05-05
Content
s
1
⮚ Paper
▪ Introduction
▪ Problem
▪ Contributions
▪ Framework
▪ Experiment
▪ Conclusion
2
Introduction
+ Graph transformer implies a global attention
mechanism to enable information passing between all
nodes
 learn the long-range dependency of the graph
structures.
 strong expressiveness, avoiding the inherent
limitations of encoding paradigms.
+ Global attention enables explicit focus on essential
parts among the nodes to model crucial substructures in
the graph.
+ Substructure is one of the basic intrinsic features of
graph data, which is widely used in both graph data
analysis and deep graph neural models.
3
Problems
❖ Graph transformer in current studies is usually shallow (< 12 layers).
❖ Not clear whether the capability of graph transformers in graph tasks can be strengthened
by increasing model depth.
❖ Difficult for deep models to autonomously learn effective attention patterns of substructures
and obtain expressive graph substructure features.
4
Contributions
• A bottleneck of graph transformers is constructed to deal with the depth limitation of current graph
transformers.
• Demonstrate the difficulty for deep models to learn effective attention patterns of substructures
and obtain informative graph substructure features.
• Proposed a simple yet effective local attention mechanism based on substructure tokens,
focusing on
+ local substructure features of deeper graph transformer and,
+ improving the expressiveness of learned representations.
• The proposed model help deal with the depth limitation of graph transformer and achieves state-
of-the-art results on standard graph benchmarks with deeper models.
5
Graph and substructure
• Graph: , nodes:
• Substructure:
=> a subset of the graph G and edges are all the existing edges in G between nodes subset, which is also
known as induced subgraph.
6
Transformer
• The self-attention:
• A complete transformer layer also contains a two-layer fully connected network FFN, and layer
normalization with residual connection LN is also applied to both self-attention module and fully-
connected network:
• The proposed model uses the distance and path-based relative position encoding methods. graph
distance DIS and shortest path information SP
7
Attention Capacity
• Graph transformers perform poorly when more layers are stacked raising doubts about the
effectiveness of these self-attention layers.
 require sufficient capacity to represent various substructures, which is necessary for learning
attention patterns.
• The capacity of attention as the maximum difference between representations learned by
different substructure attention patterns.
• A fixed attention pattern e of substructure. Attention patterns
• The attention space is spanned by these attention patterns.
• Attention from this space only focuses on the important substructures:
8
Attention Capacity with depth
• The stacked self-attention layers can cause decreased attention capacity for deeper layers.
• Then the attention capacity of layer l is upper bounded by
• After stacked transformer layers:
• The self-attention module decreases the upper bound of attention capacity exponentially.
9
Local Attention
• Reducing attention capacity with depth creates two problems for graph transformers:
+ difficulty attending to substructures,
+ loss of feature expressiveness to substructures.
• A substructure-based local attention as inductive bias into graph transformer is used to make
up for the deficiency of attention on substructures.
• A capacity decay can be alleviated if local attention to substructures is applied in each layer.
 Substructure based local attention:
 introduces the substructure based attention to encourage the model to focus on substructure
features.
 enlarges the attention capacity of deep layers to promote the representation capacity of the
model when it grows deep
10
Overview
• DeepGraph tokenizes the graph into node
level and substructure level.
• Local attention is applied to the
substructure token and corresponding
nodes.
• Global attention between nodes is still
preserved.
• Utilize attention to substructures in the
model while combining global attention, to
achieve a better global-local encoding
balance.
11
Substructures
• Sampling: there are 2 groups:
+ Neighbors containing local features, such as k-hop neighbors and random walk
neighbors
+ Geometric substructures representing graph topological features, such as circles,
paths, stars, and other special subgraphs
• Token encoding: permutation-based structure encoding. Depth-first search (DFS) with the
order of node degree is applied to reduce the possible order of substructure nodes.
• Local attention: M is the mark to determine the substructure.
=> This model combine local and global attentions.
12
Experiments
 Validate some research questions:
i. How does DeepGraph perform in comparison to existing transformers on popular
benchmarks?
ii. Does DeepGraph’s performance improve with increasing depth?
iii. Is DeepGraph capable of alleviating the problem of shrinking attention capacity?
iv. What is the impact of each part of the model on overall performance?
13
Dataset
• Tasks of the graph property prediction and node classification.
• Datasets:
+ PCQM4M-LSC is a large-scale graph-level regression dataset with more than 3.8M molecules.
+ ZINC consists of 12,000 graphs with a molecule property for regression.
+ CLUSTER and PATTERN are challenging node classification tasks with graph sizes varying
from dozens to hundreds, containing 14,000 and 12,000 graphs, respectively
14
Baseline comparison
• The depth of DeepGraph varies from 12 to 48 layers.
• The proposed model with 12-layer outperforms baseline models on PCQM4M-LSC,
ZINC, and PATTERN, especially on ZINC and PATTERN, surpassing the previous best
result reported significantly.
• As model depth increases, this model achieves consistent improvement.
15
Effect of deepening
• DeepGraph outperforms baselines by a significant margin.
• Naive deepening decreases performance on all the datasets for both baseline models.
 all the 4-time deep SAT fail to converge on CLUSTER due to the training difficulty, as well as
Graphormer with reattention.
• Fusion improves deeper Graphormer on PCQM4M-LSC, and reattention boosts 2x deep SAT on
CLUSTER slightly, but none of them are consistent.
16
Ablation and sensitivity study
• Without local attention, the model is only a relative position-based
graph transformer with deepnorm residual connections.
 performance decreases significantly on both datasets, especially
for deeper models,
 substructure encoding plays an instrumental role, emphasizing the
importance of structural information.
 deepnorm also contributes to model performance by stabilizing
the optimization, especially for 48-layer models
• Without substructure encoding, randomly initialized embedding is
applied to each substructure token.
• Geometric substructures are more critical for ZINC.
 specific structures like rings are important fundamental features
for the molecule data,
 it can be further strengthened when combined with neighbor
substructures.
17
Conclusions
• The bottleneck of graph transformers’ performance is proposed to improve Graph Transformer in depth
increases.
• Proposed a novel graph transformer model based on substructure-based local attention with additional
substructure tokens.
• Enhances the ability of the attention mechanism to focus on substructures and addresses the limitation of
encoding expressive features as the model deepens.
• Experiments show that this model breaks the depth bottleneck and achieves state-of-the-art performance
on popular graph benchmarks.
18
Thank you!

Más contenido relacionado

Similar a NS-CUK Seminar: S.T.Nguyen, Review on "Are More Layers Beneficial to Graph Transformers?", ICLR 2023

240401_Thuy_Labseminar[Train Once and Explain Everywhere: Pre-training Interp...
240401_Thuy_Labseminar[Train Once and Explain Everywhere: Pre-training Interp...240401_Thuy_Labseminar[Train Once and Explain Everywhere: Pre-training Interp...
240401_Thuy_Labseminar[Train Once and Explain Everywhere: Pre-training Interp...thanhdowork
 
NS-CUK Seminar: H.E.Lee, Review on "Weisfeiler and Leman Go Neural: Higher-O...
NS-CUK Seminar: H.E.Lee,  Review on "Weisfeiler and Leman Go Neural: Higher-O...NS-CUK Seminar: H.E.Lee,  Review on "Weisfeiler and Leman Go Neural: Higher-O...
NS-CUK Seminar: H.E.Lee, Review on "Weisfeiler and Leman Go Neural: Higher-O...ssuser4b1f48
 
NS-CUK Seminar: S.T.Nguyen, Review on "Do We Really Need Complicated Model Ar...
NS-CUK Seminar: S.T.Nguyen, Review on "Do We Really Need Complicated Model Ar...NS-CUK Seminar: S.T.Nguyen, Review on "Do We Really Need Complicated Model Ar...
NS-CUK Seminar: S.T.Nguyen, Review on "Do We Really Need Complicated Model Ar...ssuser4b1f48
 
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
Transfer Learning and Domain Adaptation (DLAI D5L2 2017 UPC Deep Learning for...
Transfer Learning and Domain Adaptation (DLAI D5L2 2017 UPC Deep Learning for...Transfer Learning and Domain Adaptation (DLAI D5L2 2017 UPC Deep Learning for...
Transfer Learning and Domain Adaptation (DLAI D5L2 2017 UPC Deep Learning for...Universitat Politècnica de Catalunya
 
Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling Yu Huang
 
NS - CUK Journal club: V.T.Hoang, Review on " Representing Long-Range Context...
NS - CUK Journal club: V.T.Hoang, Review on " Representing Long-Range Context...NS - CUK Journal club: V.T.Hoang, Review on " Representing Long-Range Context...
NS - CUK Journal club: V.T.Hoang, Review on " Representing Long-Range Context...ssuser4b1f48
 
NS-CUK Seminar: J.H.Lee, Review on "Rethinking the Expressive Power of GNNs...
NS-CUK Seminar:  J.H.Lee,  Review on "Rethinking the Expressive Power of GNNs...NS-CUK Seminar:  J.H.Lee,  Review on "Rethinking the Expressive Power of GNNs...
NS-CUK Seminar: J.H.Lee, Review on "Rethinking the Expressive Power of GNNs...ssuser4b1f48
 
NS-CUK Seminar: J.H.Lee, Review on "Rethinking the Expressive Power of GNNs ...
NS-CUK Seminar: J.H.Lee,  Review on "Rethinking the Expressive Power of GNNs ...NS-CUK Seminar: J.H.Lee,  Review on "Rethinking the Expressive Power of GNNs ...
NS-CUK Seminar: J.H.Lee, Review on "Rethinking the Expressive Power of GNNs ...ssuser4b1f48
 
Search to Distill: Pearls are Everywhere but not the Eyes
Search to Distill: Pearls are Everywhere but not the EyesSearch to Distill: Pearls are Everywhere but not the Eyes
Search to Distill: Pearls are Everywhere but not the EyesSungchul Kim
 
NS-CUK Seminar: S.T.Nguyen, Review on "Geom-GCN: Geometric Graph Convolutiona...
NS-CUK Seminar: S.T.Nguyen, Review on "Geom-GCN: Geometric Graph Convolutiona...NS-CUK Seminar: S.T.Nguyen, Review on "Geom-GCN: Geometric Graph Convolutiona...
NS-CUK Seminar: S.T.Nguyen, Review on "Geom-GCN: Geometric Graph Convolutiona...ssuser4b1f48
 
240422_Thuy_Labseminar[Large Graph Property Prediction via Graph Segment Trai...
240422_Thuy_Labseminar[Large Graph Property Prediction via Graph Segment Trai...240422_Thuy_Labseminar[Large Graph Property Prediction via Graph Segment Trai...
240422_Thuy_Labseminar[Large Graph Property Prediction via Graph Segment Trai...thanhdowork
 
NS-CUK Seminar: J.H.Lee, Review on "Relational Attention: Generalizing Trans...
NS-CUK Seminar: J.H.Lee,  Review on "Relational Attention: Generalizing Trans...NS-CUK Seminar: J.H.Lee,  Review on "Relational Attention: Generalizing Trans...
NS-CUK Seminar: J.H.Lee, Review on "Relational Attention: Generalizing Trans...ssuser4b1f48
 
Image Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyImage Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyNUPUR YADAV
 
NS-CUK Seminar: S.T.Nguyen, Review on "Improving Graph Neural Network Express...
NS-CUK Seminar: S.T.Nguyen, Review on "Improving Graph Neural Network Express...NS-CUK Seminar: S.T.Nguyen, Review on "Improving Graph Neural Network Express...
NS-CUK Seminar: S.T.Nguyen, Review on "Improving Graph Neural Network Express...ssuser4b1f48
 
NS-CUK Joint Journal Club: V.T.Hoang, Review on "NAGphormer: A Tokenized Grap...
NS-CUK Joint Journal Club: V.T.Hoang, Review on "NAGphormer: A Tokenized Grap...NS-CUK Joint Journal Club: V.T.Hoang, Review on "NAGphormer: A Tokenized Grap...
NS-CUK Joint Journal Club: V.T.Hoang, Review on "NAGphormer: A Tokenized Grap...ssuser4b1f48
 
node2vec: Scalable Feature Learning for Networks.pptx
node2vec: Scalable Feature Learning for Networks.pptxnode2vec: Scalable Feature Learning for Networks.pptx
node2vec: Scalable Feature Learning for Networks.pptxssuser2624f71
 
Roman Kyslyi: Великі мовні моделі: огляд, виклики та рішення
Roman Kyslyi: Великі мовні моделі: огляд, виклики та рішенняRoman Kyslyi: Великі мовні моделі: огляд, виклики та рішення
Roman Kyslyi: Великі мовні моделі: огляд, виклики та рішенняLviv Startup Club
 

Similar a NS-CUK Seminar: S.T.Nguyen, Review on "Are More Layers Beneficial to Graph Transformers?", ICLR 2023 (20)

Chapter 3.pptx
Chapter 3.pptxChapter 3.pptx
Chapter 3.pptx
 
240401_Thuy_Labseminar[Train Once and Explain Everywhere: Pre-training Interp...
240401_Thuy_Labseminar[Train Once and Explain Everywhere: Pre-training Interp...240401_Thuy_Labseminar[Train Once and Explain Everywhere: Pre-training Interp...
240401_Thuy_Labseminar[Train Once and Explain Everywhere: Pre-training Interp...
 
NS-CUK Seminar: H.E.Lee, Review on "Weisfeiler and Leman Go Neural: Higher-O...
NS-CUK Seminar: H.E.Lee,  Review on "Weisfeiler and Leman Go Neural: Higher-O...NS-CUK Seminar: H.E.Lee,  Review on "Weisfeiler and Leman Go Neural: Higher-O...
NS-CUK Seminar: H.E.Lee, Review on "Weisfeiler and Leman Go Neural: Higher-O...
 
NS-CUK Seminar: S.T.Nguyen, Review on "Do We Really Need Complicated Model Ar...
NS-CUK Seminar: S.T.Nguyen, Review on "Do We Really Need Complicated Model Ar...NS-CUK Seminar: S.T.Nguyen, Review on "Do We Really Need Complicated Model Ar...
NS-CUK Seminar: S.T.Nguyen, Review on "Do We Really Need Complicated Model Ar...
 
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
 
Transfer Learning and Domain Adaptation (DLAI D5L2 2017 UPC Deep Learning for...
Transfer Learning and Domain Adaptation (DLAI D5L2 2017 UPC Deep Learning for...Transfer Learning and Domain Adaptation (DLAI D5L2 2017 UPC Deep Learning for...
Transfer Learning and Domain Adaptation (DLAI D5L2 2017 UPC Deep Learning for...
 
Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling
 
NS - CUK Journal club: V.T.Hoang, Review on " Representing Long-Range Context...
NS - CUK Journal club: V.T.Hoang, Review on " Representing Long-Range Context...NS - CUK Journal club: V.T.Hoang, Review on " Representing Long-Range Context...
NS - CUK Journal club: V.T.Hoang, Review on " Representing Long-Range Context...
 
NS-CUK Seminar: J.H.Lee, Review on "Rethinking the Expressive Power of GNNs...
NS-CUK Seminar:  J.H.Lee,  Review on "Rethinking the Expressive Power of GNNs...NS-CUK Seminar:  J.H.Lee,  Review on "Rethinking the Expressive Power of GNNs...
NS-CUK Seminar: J.H.Lee, Review on "Rethinking the Expressive Power of GNNs...
 
NS-CUK Seminar: J.H.Lee, Review on "Rethinking the Expressive Power of GNNs ...
NS-CUK Seminar: J.H.Lee,  Review on "Rethinking the Expressive Power of GNNs ...NS-CUK Seminar: J.H.Lee,  Review on "Rethinking the Expressive Power of GNNs ...
NS-CUK Seminar: J.H.Lee, Review on "Rethinking the Expressive Power of GNNs ...
 
Search to Distill: Pearls are Everywhere but not the Eyes
Search to Distill: Pearls are Everywhere but not the EyesSearch to Distill: Pearls are Everywhere but not the Eyes
Search to Distill: Pearls are Everywhere but not the Eyes
 
NS-CUK Seminar: S.T.Nguyen, Review on "Geom-GCN: Geometric Graph Convolutiona...
NS-CUK Seminar: S.T.Nguyen, Review on "Geom-GCN: Geometric Graph Convolutiona...NS-CUK Seminar: S.T.Nguyen, Review on "Geom-GCN: Geometric Graph Convolutiona...
NS-CUK Seminar: S.T.Nguyen, Review on "Geom-GCN: Geometric Graph Convolutiona...
 
240422_Thuy_Labseminar[Large Graph Property Prediction via Graph Segment Trai...
240422_Thuy_Labseminar[Large Graph Property Prediction via Graph Segment Trai...240422_Thuy_Labseminar[Large Graph Property Prediction via Graph Segment Trai...
240422_Thuy_Labseminar[Large Graph Property Prediction via Graph Segment Trai...
 
NS-CUK Seminar: J.H.Lee, Review on "Relational Attention: Generalizing Trans...
NS-CUK Seminar: J.H.Lee,  Review on "Relational Attention: Generalizing Trans...NS-CUK Seminar: J.H.Lee,  Review on "Relational Attention: Generalizing Trans...
NS-CUK Seminar: J.H.Lee, Review on "Relational Attention: Generalizing Trans...
 
Image Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyImage Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A survey
 
NS-CUK Seminar: S.T.Nguyen, Review on "Improving Graph Neural Network Express...
NS-CUK Seminar: S.T.Nguyen, Review on "Improving Graph Neural Network Express...NS-CUK Seminar: S.T.Nguyen, Review on "Improving Graph Neural Network Express...
NS-CUK Seminar: S.T.Nguyen, Review on "Improving Graph Neural Network Express...
 
NS-CUK Joint Journal Club: V.T.Hoang, Review on "NAGphormer: A Tokenized Grap...
NS-CUK Joint Journal Club: V.T.Hoang, Review on "NAGphormer: A Tokenized Grap...NS-CUK Joint Journal Club: V.T.Hoang, Review on "NAGphormer: A Tokenized Grap...
NS-CUK Joint Journal Club: V.T.Hoang, Review on "NAGphormer: A Tokenized Grap...
 
NS-CUK Seminar: V.T.Hoang, Review on "Are More Layers Beneficial to Graph Tra...
NS-CUK Seminar: V.T.Hoang, Review on "Are More Layers Beneficial to Graph Tra...NS-CUK Seminar: V.T.Hoang, Review on "Are More Layers Beneficial to Graph Tra...
NS-CUK Seminar: V.T.Hoang, Review on "Are More Layers Beneficial to Graph Tra...
 
node2vec: Scalable Feature Learning for Networks.pptx
node2vec: Scalable Feature Learning for Networks.pptxnode2vec: Scalable Feature Learning for Networks.pptx
node2vec: Scalable Feature Learning for Networks.pptx
 
Roman Kyslyi: Великі мовні моделі: огляд, виклики та рішення
Roman Kyslyi: Великі мовні моделі: огляд, виклики та рішенняRoman Kyslyi: Великі мовні моделі: огляд, виклики та рішення
Roman Kyslyi: Великі мовні моделі: огляд, виклики та рішення
 

Más de ssuser4b1f48

NS-CUK Seminar: V.T.Hoang, Review on "GOAT: A Global Transformer on Large-sca...
NS-CUK Seminar: V.T.Hoang, Review on "GOAT: A Global Transformer on Large-sca...NS-CUK Seminar: V.T.Hoang, Review on "GOAT: A Global Transformer on Large-sca...
NS-CUK Seminar: V.T.Hoang, Review on "GOAT: A Global Transformer on Large-sca...ssuser4b1f48
 
NS-CUK Seminar: J.H.Lee, Review on "Graph Propagation Transformer for Graph R...
NS-CUK Seminar: J.H.Lee, Review on "Graph Propagation Transformer for Graph R...NS-CUK Seminar: J.H.Lee, Review on "Graph Propagation Transformer for Graph R...
NS-CUK Seminar: J.H.Lee, Review on "Graph Propagation Transformer for Graph R...ssuser4b1f48
 
NS-CUK Seminar: H.B.Kim, Review on "Cluster-GCN: An Efficient Algorithm for ...
NS-CUK Seminar: H.B.Kim,  Review on "Cluster-GCN: An Efficient Algorithm for ...NS-CUK Seminar: H.B.Kim,  Review on "Cluster-GCN: An Efficient Algorithm for ...
NS-CUK Seminar: H.B.Kim, Review on "Cluster-GCN: An Efficient Algorithm for ...ssuser4b1f48
 
NS-CUK Seminar:V.T.Hoang, Review on "GRPE: Relative Positional Encoding for G...
NS-CUK Seminar:V.T.Hoang, Review on "GRPE: Relative Positional Encoding for G...NS-CUK Seminar:V.T.Hoang, Review on "GRPE: Relative Positional Encoding for G...
NS-CUK Seminar:V.T.Hoang, Review on "GRPE: Relative Positional Encoding for G...ssuser4b1f48
 
NS-CUK Seminar: J.H.Lee, Review on "Learnable Structural Semantic Readout for...
NS-CUK Seminar: J.H.Lee, Review on "Learnable Structural Semantic Readout for...NS-CUK Seminar: J.H.Lee, Review on "Learnable Structural Semantic Readout for...
NS-CUK Seminar: J.H.Lee, Review on "Learnable Structural Semantic Readout for...ssuser4b1f48
 
Aug 22nd, 2023: Case Studies - The Art and Science of Animation Production)
Aug 22nd, 2023: Case Studies - The Art and Science of Animation Production)Aug 22nd, 2023: Case Studies - The Art and Science of Animation Production)
Aug 22nd, 2023: Case Studies - The Art and Science of Animation Production)ssuser4b1f48
 
Aug 17th, 2023: Case Studies - Examining Gamification through Virtual/Augment...
Aug 17th, 2023: Case Studies - Examining Gamification through Virtual/Augment...Aug 17th, 2023: Case Studies - Examining Gamification through Virtual/Augment...
Aug 17th, 2023: Case Studies - Examining Gamification through Virtual/Augment...ssuser4b1f48
 
Aug 10th, 2023: Case Studies - The Power of eXtended Reality (XR) with 360°
Aug 10th, 2023: Case Studies - The Power of eXtended Reality (XR) with 360°Aug 10th, 2023: Case Studies - The Power of eXtended Reality (XR) with 360°
Aug 10th, 2023: Case Studies - The Power of eXtended Reality (XR) with 360°ssuser4b1f48
 
Aug 8th, 2023: Case Studies - Utilizing eXtended Reality (XR) in Drones)
Aug 8th, 2023: Case Studies - Utilizing eXtended Reality (XR) in Drones)Aug 8th, 2023: Case Studies - Utilizing eXtended Reality (XR) in Drones)
Aug 8th, 2023: Case Studies - Utilizing eXtended Reality (XR) in Drones)ssuser4b1f48
 
NS-CUK Seminar: J.H.Lee, Review on "Learnable Structural Semantic Readout for...
NS-CUK Seminar: J.H.Lee, Review on "Learnable Structural Semantic Readout for...NS-CUK Seminar: J.H.Lee, Review on "Learnable Structural Semantic Readout for...
NS-CUK Seminar: J.H.Lee, Review on "Learnable Structural Semantic Readout for...ssuser4b1f48
 
NS-CUK Seminar: H.E.Lee, Review on "Gated Graph Sequence Neural Networks", I...
NS-CUK Seminar: H.E.Lee,  Review on "Gated Graph Sequence Neural Networks", I...NS-CUK Seminar: H.E.Lee,  Review on "Gated Graph Sequence Neural Networks", I...
NS-CUK Seminar: H.E.Lee, Review on "Gated Graph Sequence Neural Networks", I...ssuser4b1f48
 
NS-CUK Seminar:V.T.Hoang, Review on "Augmentation-Free Self-Supervised Learni...
NS-CUK Seminar:V.T.Hoang, Review on "Augmentation-Free Self-Supervised Learni...NS-CUK Seminar:V.T.Hoang, Review on "Augmentation-Free Self-Supervised Learni...
NS-CUK Seminar:V.T.Hoang, Review on "Augmentation-Free Self-Supervised Learni...ssuser4b1f48
 
NS-CUK Journal club: H.E.Lee, Review on " A biomedical knowledge graph-based ...
NS-CUK Journal club: H.E.Lee, Review on " A biomedical knowledge graph-based ...NS-CUK Journal club: H.E.Lee, Review on " A biomedical knowledge graph-based ...
NS-CUK Journal club: H.E.Lee, Review on " A biomedical knowledge graph-based ...ssuser4b1f48
 
NS-CUK Seminar: H.E.Lee, Review on "PTE: Predictive Text Embedding through L...
NS-CUK Seminar: H.E.Lee,  Review on "PTE: Predictive Text Embedding through L...NS-CUK Seminar: H.E.Lee,  Review on "PTE: Predictive Text Embedding through L...
NS-CUK Seminar: H.E.Lee, Review on "PTE: Predictive Text Embedding through L...ssuser4b1f48
 
NS-CUK Seminar: H.B.Kim, Review on "Inductive Representation Learning on Lar...
NS-CUK Seminar: H.B.Kim,  Review on "Inductive Representation Learning on Lar...NS-CUK Seminar: H.B.Kim,  Review on "Inductive Representation Learning on Lar...
NS-CUK Seminar: H.B.Kim, Review on "Inductive Representation Learning on Lar...ssuser4b1f48
 
NS-CUK Seminar: H.E.Lee, Review on "PTE: Predictive Text Embedding through L...
NS-CUK Seminar: H.E.Lee,  Review on "PTE: Predictive Text Embedding through L...NS-CUK Seminar: H.E.Lee,  Review on "PTE: Predictive Text Embedding through L...
NS-CUK Seminar: H.E.Lee, Review on "PTE: Predictive Text Embedding through L...ssuser4b1f48
 
NS-CUK Seminar: J.H.Lee, Review on "Relational Self-Supervised Learning on Gr...
NS-CUK Seminar: J.H.Lee, Review on "Relational Self-Supervised Learning on Gr...NS-CUK Seminar: J.H.Lee, Review on "Relational Self-Supervised Learning on Gr...
NS-CUK Seminar: J.H.Lee, Review on "Relational Self-Supervised Learning on Gr...ssuser4b1f48
 
NS-CUK Seminar: H.B.Kim, Review on "metapath2vec: Scalable representation le...
NS-CUK Seminar: H.B.Kim,  Review on "metapath2vec: Scalable representation le...NS-CUK Seminar: H.B.Kim,  Review on "metapath2vec: Scalable representation le...
NS-CUK Seminar: H.B.Kim, Review on "metapath2vec: Scalable representation le...ssuser4b1f48
 
NS-CUK Seminar: V.T.Hoang, Review on "Namkyeong Lee, et al. Relational Self-...
NS-CUK Seminar:  V.T.Hoang, Review on "Namkyeong Lee, et al. Relational Self-...NS-CUK Seminar:  V.T.Hoang, Review on "Namkyeong Lee, et al. Relational Self-...
NS-CUK Seminar: V.T.Hoang, Review on "Namkyeong Lee, et al. Relational Self-...ssuser4b1f48
 
NS-CUK Seminar: H.B.Kim, Review on "metapath2vec: Scalable representation l...
NS-CUK Seminar:  H.B.Kim,  Review on "metapath2vec: Scalable representation l...NS-CUK Seminar:  H.B.Kim,  Review on "metapath2vec: Scalable representation l...
NS-CUK Seminar: H.B.Kim, Review on "metapath2vec: Scalable representation l...ssuser4b1f48
 

Más de ssuser4b1f48 (20)

NS-CUK Seminar: V.T.Hoang, Review on "GOAT: A Global Transformer on Large-sca...
NS-CUK Seminar: V.T.Hoang, Review on "GOAT: A Global Transformer on Large-sca...NS-CUK Seminar: V.T.Hoang, Review on "GOAT: A Global Transformer on Large-sca...
NS-CUK Seminar: V.T.Hoang, Review on "GOAT: A Global Transformer on Large-sca...
 
NS-CUK Seminar: J.H.Lee, Review on "Graph Propagation Transformer for Graph R...
NS-CUK Seminar: J.H.Lee, Review on "Graph Propagation Transformer for Graph R...NS-CUK Seminar: J.H.Lee, Review on "Graph Propagation Transformer for Graph R...
NS-CUK Seminar: J.H.Lee, Review on "Graph Propagation Transformer for Graph R...
 
NS-CUK Seminar: H.B.Kim, Review on "Cluster-GCN: An Efficient Algorithm for ...
NS-CUK Seminar: H.B.Kim,  Review on "Cluster-GCN: An Efficient Algorithm for ...NS-CUK Seminar: H.B.Kim,  Review on "Cluster-GCN: An Efficient Algorithm for ...
NS-CUK Seminar: H.B.Kim, Review on "Cluster-GCN: An Efficient Algorithm for ...
 
NS-CUK Seminar:V.T.Hoang, Review on "GRPE: Relative Positional Encoding for G...
NS-CUK Seminar:V.T.Hoang, Review on "GRPE: Relative Positional Encoding for G...NS-CUK Seminar:V.T.Hoang, Review on "GRPE: Relative Positional Encoding for G...
NS-CUK Seminar:V.T.Hoang, Review on "GRPE: Relative Positional Encoding for G...
 
NS-CUK Seminar: J.H.Lee, Review on "Learnable Structural Semantic Readout for...
NS-CUK Seminar: J.H.Lee, Review on "Learnable Structural Semantic Readout for...NS-CUK Seminar: J.H.Lee, Review on "Learnable Structural Semantic Readout for...
NS-CUK Seminar: J.H.Lee, Review on "Learnable Structural Semantic Readout for...
 
Aug 22nd, 2023: Case Studies - The Art and Science of Animation Production)
Aug 22nd, 2023: Case Studies - The Art and Science of Animation Production)Aug 22nd, 2023: Case Studies - The Art and Science of Animation Production)
Aug 22nd, 2023: Case Studies - The Art and Science of Animation Production)
 
Aug 17th, 2023: Case Studies - Examining Gamification through Virtual/Augment...
Aug 17th, 2023: Case Studies - Examining Gamification through Virtual/Augment...Aug 17th, 2023: Case Studies - Examining Gamification through Virtual/Augment...
Aug 17th, 2023: Case Studies - Examining Gamification through Virtual/Augment...
 
Aug 10th, 2023: Case Studies - The Power of eXtended Reality (XR) with 360°
Aug 10th, 2023: Case Studies - The Power of eXtended Reality (XR) with 360°Aug 10th, 2023: Case Studies - The Power of eXtended Reality (XR) with 360°
Aug 10th, 2023: Case Studies - The Power of eXtended Reality (XR) with 360°
 
Aug 8th, 2023: Case Studies - Utilizing eXtended Reality (XR) in Drones)
Aug 8th, 2023: Case Studies - Utilizing eXtended Reality (XR) in Drones)Aug 8th, 2023: Case Studies - Utilizing eXtended Reality (XR) in Drones)
Aug 8th, 2023: Case Studies - Utilizing eXtended Reality (XR) in Drones)
 
NS-CUK Seminar: J.H.Lee, Review on "Learnable Structural Semantic Readout for...
NS-CUK Seminar: J.H.Lee, Review on "Learnable Structural Semantic Readout for...NS-CUK Seminar: J.H.Lee, Review on "Learnable Structural Semantic Readout for...
NS-CUK Seminar: J.H.Lee, Review on "Learnable Structural Semantic Readout for...
 
NS-CUK Seminar: H.E.Lee, Review on "Gated Graph Sequence Neural Networks", I...
NS-CUK Seminar: H.E.Lee,  Review on "Gated Graph Sequence Neural Networks", I...NS-CUK Seminar: H.E.Lee,  Review on "Gated Graph Sequence Neural Networks", I...
NS-CUK Seminar: H.E.Lee, Review on "Gated Graph Sequence Neural Networks", I...
 
NS-CUK Seminar:V.T.Hoang, Review on "Augmentation-Free Self-Supervised Learni...
NS-CUK Seminar:V.T.Hoang, Review on "Augmentation-Free Self-Supervised Learni...NS-CUK Seminar:V.T.Hoang, Review on "Augmentation-Free Self-Supervised Learni...
NS-CUK Seminar:V.T.Hoang, Review on "Augmentation-Free Self-Supervised Learni...
 
NS-CUK Journal club: H.E.Lee, Review on " A biomedical knowledge graph-based ...
NS-CUK Journal club: H.E.Lee, Review on " A biomedical knowledge graph-based ...NS-CUK Journal club: H.E.Lee, Review on " A biomedical knowledge graph-based ...
NS-CUK Journal club: H.E.Lee, Review on " A biomedical knowledge graph-based ...
 
NS-CUK Seminar: H.E.Lee, Review on "PTE: Predictive Text Embedding through L...
NS-CUK Seminar: H.E.Lee,  Review on "PTE: Predictive Text Embedding through L...NS-CUK Seminar: H.E.Lee,  Review on "PTE: Predictive Text Embedding through L...
NS-CUK Seminar: H.E.Lee, Review on "PTE: Predictive Text Embedding through L...
 
NS-CUK Seminar: H.B.Kim, Review on "Inductive Representation Learning on Lar...
NS-CUK Seminar: H.B.Kim,  Review on "Inductive Representation Learning on Lar...NS-CUK Seminar: H.B.Kim,  Review on "Inductive Representation Learning on Lar...
NS-CUK Seminar: H.B.Kim, Review on "Inductive Representation Learning on Lar...
 
NS-CUK Seminar: H.E.Lee, Review on "PTE: Predictive Text Embedding through L...
NS-CUK Seminar: H.E.Lee,  Review on "PTE: Predictive Text Embedding through L...NS-CUK Seminar: H.E.Lee,  Review on "PTE: Predictive Text Embedding through L...
NS-CUK Seminar: H.E.Lee, Review on "PTE: Predictive Text Embedding through L...
 
NS-CUK Seminar: J.H.Lee, Review on "Relational Self-Supervised Learning on Gr...
NS-CUK Seminar: J.H.Lee, Review on "Relational Self-Supervised Learning on Gr...NS-CUK Seminar: J.H.Lee, Review on "Relational Self-Supervised Learning on Gr...
NS-CUK Seminar: J.H.Lee, Review on "Relational Self-Supervised Learning on Gr...
 
NS-CUK Seminar: H.B.Kim, Review on "metapath2vec: Scalable representation le...
NS-CUK Seminar: H.B.Kim,  Review on "metapath2vec: Scalable representation le...NS-CUK Seminar: H.B.Kim,  Review on "metapath2vec: Scalable representation le...
NS-CUK Seminar: H.B.Kim, Review on "metapath2vec: Scalable representation le...
 
NS-CUK Seminar: V.T.Hoang, Review on "Namkyeong Lee, et al. Relational Self-...
NS-CUK Seminar:  V.T.Hoang, Review on "Namkyeong Lee, et al. Relational Self-...NS-CUK Seminar:  V.T.Hoang, Review on "Namkyeong Lee, et al. Relational Self-...
NS-CUK Seminar: V.T.Hoang, Review on "Namkyeong Lee, et al. Relational Self-...
 
NS-CUK Seminar: H.B.Kim, Review on "metapath2vec: Scalable representation l...
NS-CUK Seminar:  H.B.Kim,  Review on "metapath2vec: Scalable representation l...NS-CUK Seminar:  H.B.Kim,  Review on "metapath2vec: Scalable representation l...
NS-CUK Seminar: H.B.Kim, Review on "metapath2vec: Scalable representation l...
 

Último

Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 

Último (20)

Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

NS-CUK Seminar: S.T.Nguyen, Review on "Are More Layers Beneficial to Graph Transformers?", ICLR 2023

  • 1. LAB SEMINAR Nguyen Thanh Sang Network Science Lab Dept. of Artificial Intelligence The Catholic University of Korea E-mail: sang.ngt99@gmail.com Are More Layers Beneficial to Graph Transformers? --- Abdullah, Mariam, Jiayuan He, and Ke Wang --- 2023-05-05
  • 2. Content s 1 ⮚ Paper ▪ Introduction ▪ Problem ▪ Contributions ▪ Framework ▪ Experiment ▪ Conclusion
  • 3. 2 Introduction + Graph transformer implies a global attention mechanism to enable information passing between all nodes  learn the long-range dependency of the graph structures.  strong expressiveness, avoiding the inherent limitations of encoding paradigms. + Global attention enables explicit focus on essential parts among the nodes to model crucial substructures in the graph. + Substructure is one of the basic intrinsic features of graph data, which is widely used in both graph data analysis and deep graph neural models.
  • 4. 3 Problems ❖ Graph transformer in current studies is usually shallow (< 12 layers). ❖ Not clear whether the capability of graph transformers in graph tasks can be strengthened by increasing model depth. ❖ Difficult for deep models to autonomously learn effective attention patterns of substructures and obtain expressive graph substructure features.
  • 5. 4 Contributions • A bottleneck of graph transformers is constructed to deal with the depth limitation of current graph transformers. • Demonstrate the difficulty for deep models to learn effective attention patterns of substructures and obtain informative graph substructure features. • Proposed a simple yet effective local attention mechanism based on substructure tokens, focusing on + local substructure features of deeper graph transformer and, + improving the expressiveness of learned representations. • The proposed model help deal with the depth limitation of graph transformer and achieves state- of-the-art results on standard graph benchmarks with deeper models.
  • 6. 5 Graph and substructure • Graph: , nodes: • Substructure: => a subset of the graph G and edges are all the existing edges in G between nodes subset, which is also known as induced subgraph.
  • 7. 6 Transformer • The self-attention: • A complete transformer layer also contains a two-layer fully connected network FFN, and layer normalization with residual connection LN is also applied to both self-attention module and fully- connected network: • The proposed model uses the distance and path-based relative position encoding methods. graph distance DIS and shortest path information SP
  • 8. 7 Attention Capacity • Graph transformers perform poorly when more layers are stacked raising doubts about the effectiveness of these self-attention layers.  require sufficient capacity to represent various substructures, which is necessary for learning attention patterns. • The capacity of attention as the maximum difference between representations learned by different substructure attention patterns. • A fixed attention pattern e of substructure. Attention patterns • The attention space is spanned by these attention patterns. • Attention from this space only focuses on the important substructures:
  • 9. 8 Attention Capacity with depth • The stacked self-attention layers can cause decreased attention capacity for deeper layers. • Then the attention capacity of layer l is upper bounded by • After stacked transformer layers: • The self-attention module decreases the upper bound of attention capacity exponentially.
  • 10. 9 Local Attention • Reducing attention capacity with depth creates two problems for graph transformers: + difficulty attending to substructures, + loss of feature expressiveness to substructures. • A substructure-based local attention as inductive bias into graph transformer is used to make up for the deficiency of attention on substructures. • A capacity decay can be alleviated if local attention to substructures is applied in each layer.  Substructure based local attention:  introduces the substructure based attention to encourage the model to focus on substructure features.  enlarges the attention capacity of deep layers to promote the representation capacity of the model when it grows deep
  • 11. 10 Overview • DeepGraph tokenizes the graph into node level and substructure level. • Local attention is applied to the substructure token and corresponding nodes. • Global attention between nodes is still preserved. • Utilize attention to substructures in the model while combining global attention, to achieve a better global-local encoding balance.
  • 12. 11 Substructures • Sampling: there are 2 groups: + Neighbors containing local features, such as k-hop neighbors and random walk neighbors + Geometric substructures representing graph topological features, such as circles, paths, stars, and other special subgraphs • Token encoding: permutation-based structure encoding. Depth-first search (DFS) with the order of node degree is applied to reduce the possible order of substructure nodes. • Local attention: M is the mark to determine the substructure. => This model combine local and global attentions.
  • 13. 12 Experiments  Validate some research questions: i. How does DeepGraph perform in comparison to existing transformers on popular benchmarks? ii. Does DeepGraph’s performance improve with increasing depth? iii. Is DeepGraph capable of alleviating the problem of shrinking attention capacity? iv. What is the impact of each part of the model on overall performance?
  • 14. 13 Dataset • Tasks of the graph property prediction and node classification. • Datasets: + PCQM4M-LSC is a large-scale graph-level regression dataset with more than 3.8M molecules. + ZINC consists of 12,000 graphs with a molecule property for regression. + CLUSTER and PATTERN are challenging node classification tasks with graph sizes varying from dozens to hundreds, containing 14,000 and 12,000 graphs, respectively
  • 15. 14 Baseline comparison • The depth of DeepGraph varies from 12 to 48 layers. • The proposed model with 12-layer outperforms baseline models on PCQM4M-LSC, ZINC, and PATTERN, especially on ZINC and PATTERN, surpassing the previous best result reported significantly. • As model depth increases, this model achieves consistent improvement.
  • 16. 15 Effect of deepening • DeepGraph outperforms baselines by a significant margin. • Naive deepening decreases performance on all the datasets for both baseline models.  all the 4-time deep SAT fail to converge on CLUSTER due to the training difficulty, as well as Graphormer with reattention. • Fusion improves deeper Graphormer on PCQM4M-LSC, and reattention boosts 2x deep SAT on CLUSTER slightly, but none of them are consistent.
  • 17. 16 Ablation and sensitivity study • Without local attention, the model is only a relative position-based graph transformer with deepnorm residual connections.  performance decreases significantly on both datasets, especially for deeper models,  substructure encoding plays an instrumental role, emphasizing the importance of structural information.  deepnorm also contributes to model performance by stabilizing the optimization, especially for 48-layer models • Without substructure encoding, randomly initialized embedding is applied to each substructure token. • Geometric substructures are more critical for ZINC.  specific structures like rings are important fundamental features for the molecule data,  it can be further strengthened when combined with neighbor substructures.
  • 18. 17 Conclusions • The bottleneck of graph transformers’ performance is proposed to improve Graph Transformer in depth increases. • Proposed a novel graph transformer model based on substructure-based local attention with additional substructure tokens. • Enhances the ability of the attention mechanism to focus on substructures and addresses the limitation of encoding expressive features as the model deepens. • Experiments show that this model breaks the depth bottleneck and achieves state-of-the-art performance on popular graph benchmarks.