Exploring the Global-to-Local Attention Scheme in Graph Transformers: An Empirical Study
By: Zhengwei Wang, Gang Wu
Potential Business Impact:
Helps computers understand complex connections better.
Graph Transformers (GTs) show considerable potential in graph representation learning. GT architectures typically integrate Graph Neural Networks (GNNs) with a global attention mechanism, either in parallel or with the GNN preceding the attention module, yielding a local-and-global or local-to-global attention scheme. However, because the global attention mechanism primarily captures long-range dependencies between nodes, these integration schemes may suffer from information loss: the local neighborhood information learned by the GNN can be diluted by the attention mechanism. We therefore propose G2LFormer, which features a novel global-to-local attention scheme in which the shallow layers use attention mechanisms to capture global information, while the deeper layers employ GNN modules to learn local structural information, preventing nodes from ignoring their immediate neighbors. An effective cross-layer information fusion strategy allows the local layers to retain beneficial information from the global layers and alleviates information loss, with an acceptable trade-off in scalability. To validate the feasibility of the global-to-local attention scheme, we compare G2LFormer with state-of-the-art linear GTs and GNNs on node-level and graph-level tasks. The results indicate that G2LFormer achieves excellent performance while retaining linear complexity.
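To make the layer ordering concrete, here is a minimal sketch (not the authors' code) of a global-to-local block stack: shallow global self-attention layers are followed by simple neighborhood-aggregation GNN layers, and a hypothetical gated fusion lets each local layer reuse the output of the global layers. The class names, the mean-aggregation GNN, and the gating mechanism are illustrative assumptions; the paper's actual attention is linear-complexity, whereas full softmax attention is used here only for brevity.

```python
# Illustrative sketch of a global-to-local (attention -> GNN) layer ordering.
# Assumptions (not from the paper): full softmax attention instead of linear
# attention, mean aggregation as the GNN, and a sigmoid gate as the
# cross-layer fusion strategy.
import torch
import torch.nn as nn


class GlobalAttentionLayer(nn.Module):
    """Self-attention over all nodes (captures long-range dependencies)."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):                      # x: (1, N, dim)
        h, _ = self.attn(x, x, x)
        return self.norm(x + h)


class LocalGNNLayer(nn.Module):
    """Mean aggregation over immediate neighbors (captures local structure)."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x, adj):                 # adj: (N, N), row-normalized
        h = self.lin(adj @ x.squeeze(0)).unsqueeze(0)
        return self.norm(x + torch.relu(h))


class G2LBlockStack(nn.Module):
    """Global-to-local ordering: attention layers first, GNN layers after,
    with a (hypothetical) gated fusion of the global output into each local layer."""
    def __init__(self, dim, n_global=2, n_local=2):
        super().__init__()
        self.global_layers = nn.ModuleList([GlobalAttentionLayer(dim) for _ in range(n_global)])
        self.local_layers = nn.ModuleList([LocalGNNLayer(dim) for _ in range(n_local)])
        self.fuse_gate = nn.Linear(2 * dim, dim)   # assumed fusion mechanism

    def forward(self, x, adj):
        for layer in self.global_layers:           # shallow layers: global attention
            x = layer(x)
        x_global = x                               # keep the global-layer output
        for layer in self.local_layers:            # deeper layers: local GNN
            x = layer(x, adj)
            gate = torch.sigmoid(self.fuse_gate(torch.cat([x, x_global], dim=-1)))
            x = gate * x + (1 - gate) * x_global   # retain global information
        return x


if __name__ == "__main__":
    N, dim = 6, 32
    adj = torch.rand(N, N)
    adj = adj / adj.sum(dim=1, keepdim=True)       # toy row-normalized adjacency
    feats = torch.randn(1, N, dim)
    out = G2LBlockStack(dim)(feats, adj)
    print(out.shape)                               # torch.Size([1, 6, 32])
```

The point of the sketch is only the ordering: global information is computed first and then fused into the local layers, rather than a GNN output being fed into (and possibly diluted by) a subsequent attention module.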
Similar Papers
When Does Global Attention Help? A Unified Empirical Study on Atomistic Graph Learning
Machine Learning (CS)
Helps computers predict material properties faster.
Attention Beyond Neighborhoods: Reviving Transformer for Graph Clustering
Machine Learning (CS)
Helps computers group similar things by looking at connections.
Unifying and Enhancing Graph Transformers via a Hierarchical Mask Framework
CV and Pattern Recognition
Helps computers understand complex connections better.