
Enhanced Pre-training of Graph Neural Networks for Million-Scale Heterogeneous Graphs

Published: October 14, 2025 | arXiv ID: 2510.12401v1

By: Shengyin Sun, Chen Ma, Jiehao Chen

Potential Business Impact:

Teaches computers to learn from messy, mixed-up data.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

In recent years, graph neural networks (GNNs) have facilitated the development of graph data mining. However, training GNNs requires sufficient labeled task-specific data, which is expensive and sometimes unavailable. To be less dependent on labeled data, recent studies propose pre-training GNNs in a self-supervised manner and then applying the pre-trained GNNs to downstream tasks with limited labeled data. However, most existing methods are designed solely for homogeneous graphs, although real-world graphs are mostly heterogeneous, and they do not consider semantic mismatch (the semantic difference between the original data and the ideal data containing more transferable semantic information). In this paper, we propose an effective framework to pre-train GNNs on large-scale heterogeneous graphs. We first design a structure-aware pre-training task, which aims to capture structural properties in heterogeneous graphs. Then, we design a semantic-aware pre-training task to tackle the mismatch. Specifically, we construct a perturbation subspace composed of semantic neighbors to help deal with the semantic mismatch. Semantic neighbors make the model focus more on the general knowledge in the semantic space, which in turn helps the model learn knowledge with better transferability. Finally, extensive experiments on real-world large-scale heterogeneous graphs demonstrate the superiority of the proposed method over state-of-the-art baselines. Code is available at https://github.com/sunshy-1/PHE.
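
The abstract does not spell out how the perturbation subspace is built, so the following is only a rough illustration of the general idea of perturbing a node toward its semantic neighbors. It is not the authors' method or the PHE code. Everything in it is an assumption made for illustration: the function names (`cosine`, `semantic_neighbors`, `perturb`), the use of cosine similarity to pick neighbors, the convex-combination mixing rule, and the `k` and `strength` parameters.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def semantic_neighbors(embeddings, node, k):
    """Return the k nodes whose embeddings are most similar to `node`'s.

    Assumption: "semantic neighbor" is approximated here by cosine
    similarity in embedding space; the paper may define it differently.
    """
    scores = [
        (cosine(embeddings[node], emb), other)
        for other, emb in embeddings.items()
        if other != node
    ]
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [other for _, other in scores[:k]]


def perturb(embeddings, node, k=2, strength=0.3, rng=None):
    """Move `node`'s embedding toward a random convex combination of its
    k semantic neighbors. `strength` in [0, 1] sets how far it moves.

    Assumption: this mixing rule is a hypothetical stand-in for the
    paper's perturbation subspace, not a reproduction of it.
    """
    rng = rng or random.Random(0)
    original = embeddings[node]
    neighbors = semantic_neighbors(embeddings, node, k)
    if not neighbors:
        return list(original)
    # Random non-negative weights, normalised to sum to 1.
    weights = [rng.random() for _ in neighbors]
    total = sum(weights) or 1.0
    weights = [w / total for w in weights]
    mixed = [
        sum(w * embeddings[n][d] for w, n in zip(weights, neighbors))
        for d in range(len(original))
    ]
    return [
        (1 - strength) * x + strength * m for x, m in zip(original, mixed)
    ]


# Toy embeddings for a tiny heterogeneous graph (paper and author nodes).
toy = {
    "paper_a": [1.0, 0.0],
    "paper_b": [0.9, 0.1],
    "paper_c": [0.8, 0.2],
    "author_x": [0.0, 1.0],
}
perturbed_a = perturb(toy, "paper_a", k=2, strength=0.3)
```

In this toy setup the perturbed vector for `paper_a` stays close to the other paper nodes and is not pulled toward `author_x`, which is the intuition behind restricting perturbations to semantically similar nodes.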

Pre-training Graph Neural Networks with Structural Fingerprints for Materials Discovery

Materials Science

Teaches computers to understand materials faster.

3 Mar 2025

88%

Adaptive Heterogeneous Graph Neural Networks: Bridging Heterophily and Heterogeneity

Machine Learning (CS)

Helps computers understand messy, connected information better.

8 Aug 2025

88%

Generalize across Homophily and Heterophily: Hybrid Spectral Graph Pre-Training and Prompt Tuning

Machine Learning (CS)

Helps computers learn from messy, mixed-up data.

15 Aug 2025


Country of Origin

🇭🇰 Hong Kong

Repos / Data Links

github.com

Page Count

26 pages
