Enhancing Document Retrieval for Curating N-ary Relations in Knowledge Bases
By: Xing David Wang, Ulf Leser
Potential Business Impact:
Finds medical facts to build knowledge faster.
Curation of biomedical knowledge bases (KBs) relies on extracting accurate multi-entity relational facts from the literature, a process that remains largely manual and expert-driven. An essential step in this workflow is retrieving documents that can support or complete partially observed n-ary relations. We present a neural retrieval model designed to assist KB curation by identifying documents that help fill in missing relation arguments and provide relevant contextual evidence. To reduce dependence on scarce gold-standard training data, we exploit existing KB records to construct weakly supervised training sets. Our approach introduces two key technical contributions: (i) a layered contrastive loss that enables learning from noisy and incomplete relational structures, and (ii) a balanced sampling strategy that generates high-quality negatives from diverse KB records. On two biomedical retrieval benchmarks, our approach achieves state-of-the-art performance, outperforming strong baselines in NDCG@10 by 5.7 and 3.7 percentage points, respectively.
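For readers unfamiliar with the reported metric, the following is a minimal sketch of how NDCG@10 is commonly computed for a single query. It is not taken from the paper: the function names `dcg_at_k` and `ndcg_at_k` are hypothetical, and the sketch assumes linear gain and that `relevances` holds the graded relevance labels of all judged documents for the query, in the order the system ranked them.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k ranked positions (linear gain)."""
    # Position 0 is discounted by log2(2) = 1, position 1 by log2(3), and so on.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the system ranking divided by the DCG of the ideal ranking.

    Assumption: `relevances` covers every judged document for the query,
    so sorting it in descending order yields the ideal ranking.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    # A query with no relevant documents contributes 0 by convention here.
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # One relevant document ranked second instead of first.
    print(ndcg_at_k([0, 1, 0, 0]))
```

Scores fall between 0 and 1 and are typically averaged over all queries in a benchmark, so a gain of 5.7 percentage points corresponds to 0.057 on this scale.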
Similar Papers
Entity-Augmented Neuroscience Knowledge Retrieval Using Ontology and Semantic Understanding Capability of LLM
Computation and Language
Finds hidden brain science facts in papers.
BiCA: Effective Biomedical Dense Retrieval with Citation-Aware Hard Negatives
Information Retrieval
Helps computers find science papers better.
Two-dimensional Taxonomy for N-ary Knowledge Representation Learning Methods
Machine Learning (CS)
Maps complex relationships better than simple links.