ParaFormer: A Generalized PageRank Graph Transformer for Graph Representation Learning
By: Chaohao Yuan, Zhenjie Song, Ercan Engin Kuruoglu, and more
Graph Transformers (GTs) have emerged as a promising graph learning tool, leveraging their all-pair connectivity to capture global information effectively. Global attention was originally introduced to address the over-smoothing problem in deep GNNs, removing the need to stack deep GNN layers. However, through empirical and theoretical analysis, we verify that this global attention itself exhibits severe over-smoothing: because it acts as an inherent low-pass filter, node representations become indistinguishable, and the effect is even stronger than in GNNs. To mitigate this, we propose the PageRank Transformer (ParaFormer), which features a PageRank-enhanced attention module designed to mimic the behavior of deep Transformers. We demonstrate theoretically and empirically that ParaFormer mitigates over-smoothing by acting as an adaptive-pass filter. Experiments on 11 datasets, ranging from thousands to millions of nodes, show that ParaFormer achieves consistent improvements on both node classification and graph classification tasks, validating its efficacy. Supplementary material, including code and the appendix, is available at https://github.com/chaohaoyuan/ParaFormer.
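The abstract describes the PageRank-enhanced attention only at a high level. The sketch below shows one plausible way a generalized-PageRank-style combination of attention propagation steps could be implemented in PyTorch, assuming a single attention matrix is applied repeatedly and the hops are mixed with learnable weights. All names and hyperparameters (GPRAttention, hidden_dim, num_hops, gamma) are illustrative assumptions, not the authors' implementation; see the linked repository for the actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GPRAttention(nn.Module):
    """Minimal sketch of a generalized-PageRank-style attention layer.

    Rather than stacking K separate attention layers, one dense attention
    matrix A is applied repeatedly, and the propagation steps A^k V are
    combined with learnable coefficients gamma_k, imitating the output of
    a deep Transformer with a single shallow module. Hypothetical layout,
    not the ParaFormer reference implementation.
    """

    def __init__(self, hidden_dim: int, num_hops: int = 4):
        super().__init__()
        self.q_proj = nn.Linear(hidden_dim, hidden_dim)
        self.k_proj = nn.Linear(hidden_dim, hidden_dim)
        self.v_proj = nn.Linear(hidden_dim, hidden_dim)
        # Learnable generalized-PageRank weights, one per hop k = 0..num_hops.
        self.gamma = nn.Parameter(
            torch.full((num_hops + 1,), 1.0 / (num_hops + 1))
        )
        self.scale = hidden_dim ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_nodes, hidden_dim) node features for one graph.
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        # Dense all-pair attention matrix, row-stochastic after the softmax.
        attn = F.softmax(q @ k.transpose(-1, -2) * self.scale, dim=-1)

        out = self.gamma[0] * v  # hop 0: the projected features themselves
        h = v
        for k_hop in range(1, self.gamma.numel()):
            h = attn @ h                   # one more attention propagation step
            out = out + self.gamma[k_hop] * h
        return out


if __name__ == "__main__":
    layer = GPRAttention(hidden_dim=64, num_hops=4)
    x = torch.randn(100, 64)               # 100 nodes, 64-dim features
    print(layer(x).shape)                   # torch.Size([100, 64])
```

Because the per-hop weights gamma_k are learned, such a layer can up-weight or down-weight individual propagation depths, which is one way to realize the adaptive-pass filtering behavior the abstract attributes to ParaFormer.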