Score: 0

An Effective Approach for Node Classification in Textual Graphs

Published: August 7, 2025 | arXiv ID: 2508.05836v1

By: Rituparna Datta, Nibir Chandra Mandal

Potential Business Impact:

Helps computers understand research papers better.

Textual Attribute Graphs (TAGs) are critical for modeling complex networks like citation networks, but effective node classification remains challenging due to difficulties in integrating rich semantics from text with structural graph information. Existing methods often struggle with capturing nuanced domain-specific terminology, modeling long-range dependencies, adapting to temporal evolution, and scaling to massive datasets. To address these issues, we propose a novel framework that integrates TAPE (Text-Attributed Graph Representation Enhancement) with Graphormer. Our approach leverages a large language model (LLM), specifically ChatGPT, within the TAPE framework to generate semantically rich explanations from paper content, which are then fused into enhanced node representations. These embeddings are combined with structural features using a novel integration layer with learned attention weights. Graphormer's path-aware position encoding and multi-head attention mechanisms are employed to effectively capture long-range dependencies across the citation network. We demonstrate the efficacy of our framework on the challenging ogbn-arxiv dataset, achieving state-of-the-art performance with a classification accuracy of 0.772, significantly surpassing the best GCN baseline of 0.713. Our method also yields strong results in precision (0.671), recall (0.577), and F1-score (0.610). We validate our approach through comprehensive ablation studies that quantify the contribution of each component, demonstrating the synergy between semantic and structural information. Our framework provides a scalable and robust solution for node classification in dynamic TAGs, offering a promising direction for future research in knowledge systems and scientific discovery.

Country of Origin
🇺🇸 United States

Page Count
6 pages

Category
Computer Science:
Machine Learning (CS)