Learning with Dual-level Noisy Correspondence for Multi-modal Entity Alignment
By: Haobin Li, Yijie Lin, Peng Hu and more
Potential Business Impact:
Connects messy information from different sources.
Multi-modal entity alignment (MMEA) aims to identify equivalent entities across heterogeneous multi-modal knowledge graphs (MMKGs), where each entity is described by attributes from various modalities. Existing methods typically assume that both intra-entity and inter-graph correspondences are faultless, an assumption that is often violated in real-world MMKGs due to the reliance on expert annotations. In this paper, we reveal and study a highly practical yet under-explored problem in MMEA, termed Dual-level Noisy Correspondence (DNC). DNC refers to misalignments in both intra-entity (entity-attribute) and inter-graph (entity-entity and attribute-attribute) correspondences. To address the DNC problem, we propose a robust MMEA framework termed RULE. RULE first estimates the reliability of both intra-entity and inter-graph correspondences via a dedicated two-fold principle. Leveraging the estimated reliabilities, RULE mitigates the negative impact of intra-entity noise during attribute fusion and prevents overfitting to noisy inter-graph correspondences during inter-graph discrepancy elimination. Beyond the training-time designs, RULE further incorporates a correspondence reasoning module that uncovers the underlying attribute-attribute connection across graphs, guaranteeing more accurate equivalent entity identification. Extensive experiments on five benchmarks verify the effectiveness of our method against DNC in comparison with seven state-of-the-art methods. The code is available at https://github.com/XLearning-SCU/RULE.
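To make the idea of reliability-aware attribute fusion concrete, the following is a minimal sketch and not RULE's actual implementation. It assumes each attribute (modality) of an entity already has an embedding and a scalar reliability score; the function names `softmax` and `fuse_attributes`, the softmax weighting, and the `temperature` parameter are all illustrative assumptions that do not come from the paper.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scalar scores."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_attributes(embeddings, reliabilities, temperature=1.0):
    """Fuse per-attribute embeddings into one entity embedding.

    Hypothetical helper: attributes with a higher estimated reliability
    receive a larger weight, so a noisy (mismatched) attribute
    contributes less to the fused representation.
    """
    if not embeddings or len(embeddings) != len(reliabilities):
        raise ValueError("need one reliability score per attribute embedding")
    weights = softmax(reliabilities, temperature)
    dim = len(embeddings[0])
    # Weighted sum over attributes, computed per embedding dimension.
    return [
        sum(w * emb[i] for w, emb in zip(weights, embeddings))
        for i in range(dim)
    ]


# Two attributes of one entity: an image embedding and a text embedding.
image_emb = [1.0, 0.0]
text_emb = [0.0, 1.0]

# Assumed reliability scores: the image is judged to be a likely mismatch.
fused = fuse_attributes([image_emb, text_emb], reliabilities=[-2.0, 2.0])
print(fused)
```

In this sketch the fused vector is dominated by the text embedding because the image attribute was given a low reliability score. How RULE actually estimates and applies reliabilities is described in the paper and the linked repository.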
Similar Papers
Mitigating Modality Bias in Multi-modal Entity Alignment from a Causal Perspective
Multimedia
Finds matching things even with bad pictures.
MCA: 2D-3D Retrieval with Noisy Labels via Multi-level Adaptive Correction and Alignment
CV and Pattern Recognition
Finds matching 3D objects from 2D pictures.
Complementarity-driven Representation Learning for Multi-modal Knowledge Graph Completion
Artificial Intelligence
Helps computers learn more from pictures and words.