Score: 3

Query-Specific GNN: A Comprehensive Graph Representation Learning Method for Retrieval Augmented Generation

Published: October 13, 2025 | arXiv ID: 2510.11541v1

By: Yuchen Yan, Zhihua Liu, Hao Wang and more

BigTech Affiliations: Xiaomi, Samsung

Potential Business Impact:

Helps AI answer harder questions by finding more facts.

Business Areas:
Semantic Search, Internet Services

Retrieval-augmented generation (RAG) has demonstrated its ability to enhance Large Language Models (LLMs) by integrating external knowledge sources. However, multi-hop questions, which require identifying multiple knowledge targets to form a synthesized answer, raise new challenges for RAG systems. In multi-hop settings, existing methods often struggle to fully understand questions with complex semantic structures and are susceptible to irrelevant noise when retrieving multiple information targets. To address these limitations, we propose a novel graph representation learning framework for multi-hop question retrieval. We first introduce a Multi-information Level Knowledge Graph (Multi-L KG) to model various information levels for a more comprehensive understanding of multi-hop questions. Based on this, we design a Query-Specific Graph Neural Network (QSGNN) for representation learning on the Multi-L KG. QSGNN employs intra/inter-level message passing mechanisms, and in each message-passing step the information aggregation is guided by the query, which not only facilitates multi-granular information aggregation but also significantly reduces the impact of noise. To enhance its ability to learn robust representations, we further propose two synthesized data generation strategies for pre-training the QSGNN. Extensive experimental results demonstrate the effectiveness of our framework in multi-hop scenarios; on high-hop questions in particular, the improvement can reach 33.8%. The code is available at: https://github.com/Jerry2398/QSGNN.
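
The abstract says that information aggregation in each message-passing step is guided by the query. As a rough illustration only, the sketch below weights each neighbor by a softmax over its dot-product similarity to the query before summing. This is an assumed, generic attention-style aggregation, not the QSGNN formulation from the paper; the function name `query_guided_aggregate`, the dot-product scoring, and the toy vectors are all hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def query_guided_aggregate(query, neighbor_feats):
    """Aggregate neighbor features, weighting each by similarity to the query.

    Hypothetical stand-in for one query-guided message-passing step:
    neighbors that align with the query dominate the result, while
    unrelated (noisy) neighbors are down-weighted.
    """
    weights = softmax([dot(query, h) for h in neighbor_feats])
    dim = len(query)
    aggregated = [
        sum(w * h[i] for w, h in zip(weights, neighbor_feats))
        for i in range(dim)
    ]
    return aggregated, weights


# Toy example: the first neighbor aligns with the query, the second does not.
query = [1.0, 0.0]
neighbors = [[5.0, 0.0], [0.0, 5.0]]
agg, weights = query_guided_aggregate(query, neighbors)
```

In this toy case the query-aligned neighbor receives nearly all of the weight, which is the intuition behind using the query to suppress irrelevant neighbors during aggregation.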

Country of Origin
🇨🇳 🇰🇷 China, South Korea

Repos / Data Links
https://github.com/Jerry2398/QSGNN

Page Count
25 pages

Category
Computer Science:
Machine Learning (CS)