SocraticKG: Knowledge Graph Construction via QA-Driven Fact Extraction
By: Sanghyeok Choi, Woosang Jeon, Kyuseok Yang and more
Constructing Knowledge Graphs (KGs) from unstructured text provides a structured framework for knowledge representation and reasoning, yet current LLM-based approaches struggle with a fundamental trade-off: factual coverage often leads to relational fragmentation, while premature consolidation causes information loss. To address this, we propose SocraticKG, an automated KG construction method that introduces question-answer pairs as a structured intermediate representation to systematically unfold document-level semantics prior to triple extraction. By employing 5W1H-guided QA expansion, SocraticKG captures contextual dependencies and implicit relational links typically lost in direct KG extraction pipelines, providing explicit grounding in the source document that helps mitigate implicit reasoning errors. Evaluation on the MINE benchmark demonstrates that our approach effectively addresses the coverage-connectivity trade-off, achieving superior factual retention while maintaining high structural cohesion even as extracted knowledge volume substantially expands. These results highlight that QA-mediated semantic scaffolding plays a critical role in structuring semantics prior to KG extraction, enabling more coherent and reliable graph construction in subsequent stages.
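The data flow described in the abstract (5W1H questions, then QA pairs, then triples) can be illustrated with a minimal sketch. This is not the authors' implementation. The names `QAPair`, `Triple`, `qa_to_triple`, and `count_components` are hypothetical, the QA pairs are hand-written toy data rather than model output, using the question type as the relation label is a simplifying assumption, and counting connected components is only one possible proxy for structural cohesion.

```python
from dataclasses import dataclass

# The six question types referred to by "5W1H".
FIVE_W_ONE_H = ("who", "what", "when", "where", "why", "how")


@dataclass(frozen=True)
class QAPair:
    qtype: str      # one of FIVE_W_ONE_H
    question: str
    answer: str


@dataclass(frozen=True)
class Triple:
    subject: str
    relation: str
    obj: str


def qa_to_triple(qa: QAPair, subject: str) -> Triple:
    """Hypothetical mapping: anchor the answer to a shared subject and
    reuse the question type as the relation label (an assumption)."""
    if qa.qtype not in FIVE_W_ONE_H:
        raise ValueError(f"unknown question type: {qa.qtype}")
    return Triple(subject, qa.qtype, qa.answer)


def count_components(triples: list[Triple]) -> int:
    """Count connected components with union-find, treating each triple
    as an undirected edge between its subject and object."""
    parent: dict[str, str] = {}

    def find(node: str) -> str:
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for t in triples:
        root_s, root_o = find(t.subject), find(t.obj)
        if root_s != root_o:
            parent[root_s] = root_o
    return len({find(node) for node in list(parent)})


# Toy QA pairs for the sentence
# "Marie Curie discovered polonium in Paris in 1898." (illustrative only).
qa_pairs = [
    QAPair("who", "Who discovered polonium?", "Marie Curie"),
    QAPair("when", "When was polonium discovered?", "1898"),
    QAPair("where", "Where was polonium discovered?", "Paris"),
]
triples = [qa_to_triple(qa, "polonium discovery") for qa in qa_pairs]

# Two triples with no shared entity, standing in for a fragmented graph.
fragmented = [Triple("A", "rel", "B"), Triple("C", "rel", "D")]

print(count_components(triples))     # one component: all facts share a subject
print(count_components(fragmented))  # two components: no shared entity
```

Because every QA pair in the toy example is anchored to the same subject, adding more pairs increases the number of facts without increasing the number of components, which is the intuition behind raising coverage while keeping the graph connected.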
Similar Papers
KGQuest: Template-Driven QA Generation from Knowledge Graphs with LLM-Based Refinement
Computation and Language
Creates smart questions and answers from facts.
Ontology-Based Knowledge Graph Framework for Industrial Standard Documents via Hierarchical and Propositional Structuring
Information Retrieval
Organizes complex rules into smart computer knowledge.
Knowledge Graph-extended Retrieval Augmented Generation for Question Answering
Machine Learning (CS)
AI answers questions better by using facts.