Attention Beyond Neighborhoods: Reviving Transformer for Graph Clustering
By: Xuanting Xie, Bingheng Li, Erlin Pan, and more
Potential Business Impact:
Helps computers group similar things by looking at how they are connected.
Attention mechanisms have become a cornerstone of modern neural networks, driving breakthroughs across diverse domains. However, their application to graph-structured data, where capturing topological connections is essential, remains underexplored and underperforms Graph Neural Networks (GNNs), particularly on the graph clustering task. GNNs tend to overemphasize neighborhood aggregation, homogenizing node representations. Conversely, Transformers tend to over-globalize, highlighting distant nodes at the expense of meaningful local patterns. This dichotomy raises a key question: is attention inherently redundant for unsupervised graph learning? To address it, we conduct a comprehensive empirical analysis that uncovers the complementary weaknesses of GNNs and Transformers in graph clustering. Motivated by these insights, we propose the Attentive Graph Clustering Network (AGCN), a novel architecture that reinterprets the graph itself as attention. AGCN embeds the attention mechanism directly into the graph structure, enabling effective global information extraction while remaining sensitive to local topological cues. Our framework incorporates a theoretical analysis contrasting AGCN's behavior with that of GNNs and Transformers, and introduces two innovations: (1) a KV cache mechanism to improve computational efficiency, and (2) a pairwise margin contrastive loss to boost the discriminative capacity of the attention space. Extensive experimental results demonstrate that AGCN outperforms state-of-the-art methods.
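To make the abstract's two ingredients concrete, below is a minimal PyTorch sketch of (1) attention whose pairwise scores are biased by the adjacency matrix, one plausible reading of "embedding the attention mechanism into the graph structure", and (2) a pairwise margin (hinge) contrastive loss over node embeddings. The function names, the additive adjacency bias, the blending weight `alpha`, and the pair-sampling scheme are illustrative assumptions, not the paper's actual formulation; the KV cache component is omitted.

```python
import torch
import torch.nn.functional as F

def graph_biased_attention(x, adj, w_q, w_k, w_v, alpha=0.5):
    """Single-head attention over all node pairs (global view) whose
    scores are additively biased by the adjacency matrix, so local
    topology is not washed out by distant nodes.
    x: (N, d) node features; adj: (N, N) dense 0/1 adjacency."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = (q @ k.T) / k.shape[-1] ** 0.5  # global pairwise scores
    scores = scores + alpha * adj            # bias toward graph neighbors
    attn = torch.softmax(scores, dim=-1)
    return attn @ v

def pairwise_margin_contrastive(z, pos_pairs, neg_pairs, margin=1.0):
    """Pull positive pairs together; push negative pairs at least
    `margin` apart (hinge on pairwise Euclidean distance).
    pos_pairs / neg_pairs: tuples (idx_i, idx_j) of index tensors."""
    d_pos = (z[pos_pairs[0]] - z[pos_pairs[1]]).norm(dim=-1)
    d_neg = (z[neg_pairs[0]] - z[neg_pairs[1]]).norm(dim=-1)
    return d_pos.pow(2).mean() + F.relu(margin - d_neg).pow(2).mean()

# Toy usage on a random graph (all values illustrative).
N, d = 6, 4
x = torch.randn(N, d)
adj = (torch.rand(N, N) > 0.6).float()
w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
z = graph_biased_attention(x, adj, w_q, w_k, w_v)
pos = (torch.tensor([0, 1]), torch.tensor([1, 2]))  # assumed same-cluster
neg = (torch.tensor([0, 3]), torch.tensor([4, 5]))  # assumed cross-cluster
print(pairwise_margin_contrastive(z, pos, neg).item())
```

The additive bias is one simple way to keep attention global while still rewarding edges; a hard mask (setting non-edge scores to negative infinity) would instead collapse back to purely local, GNN-like aggregation, which is exactly the failure mode the paper argues against.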
Similar Papers
When Does Global Attention Help? A Unified Empirical Study on Atomistic Graph Learning
Machine Learning (CS)
Helps computers predict material properties faster.
Exploring the Global-to-Local Attention Scheme in Graph Transformers: An Empirical Study
Machine Learning (CS)
Helps computers understand complex connections better.
Topologic Attention Networks: Attending to Direct and Indirect Neighbors through Gaussian Belief Propagation
Machine Learning (CS)
Lets computers understand complex connections faster.