A Survey of Large Language Models for Data Challenges in Graphs
By: Mengran Li, Pengyu Zhang, Wenbin Xing and more
Potential Business Impact:
Helps computers understand messy, changing information better.
Graphs are a widely used paradigm for representing non-Euclidean data, with applications ranging from social network analysis to biomolecular prediction. While graph learning has achieved remarkable progress, real-world graph data presents challenges that significantly hinder the learning process. In this survey, we focus on four fundamental data-centric challenges: (1) Incompleteness: real-world graphs have missing nodes, edges, or attributes; (2) Imbalance: the distributions of node or edge labels, and of the graph structures themselves, are highly skewed in real-world graphs; (3) Cross-domain Heterogeneity: graphs from different domains exhibit incompatible feature spaces or structural patterns; and (4) Dynamic Instability: graphs evolve over time in unpredictable ways. Recently, Large Language Models (LLMs) have shown the potential to tackle these challenges by leveraging rich semantic reasoning and external knowledge. This survey examines how LLMs can address these four challenges in graph-structured data, thereby improving the effectiveness of graph learning. For each challenge, we review both traditional solutions and modern LLM-driven approaches, highlighting the unique advantages that LLMs contribute. Finally, we discuss open research questions and promising future directions in this emerging interdisciplinary field. To support further exploration, we have curated a repository of recent advances on graph learning challenges: https://github.com/limengran98/Awesome-Literature-Graph-Learning-Challenges.
Similar Papers
Actions Speak Louder than Prompts: A Large-Scale Study of LLMs for Graph Inference
Computation and Language
Helps computers learn better from connected information.
Each Graph is a New Language: Graph Learning with LLMs
Computation and Language
Teaches computers to understand connections in data.
Large Language Models for Knowledge Graph Embedding: A Survey
Computation and Language
Helps computers understand and connect information better.