UGG-ReID: Uncertainty-Guided Graph Model for Multi-Modal Object Re-Identification

Published: July 7, 2025 | arXiv ID: 2507.04638v2

By: Xixi Wan, Aihua Zheng, Bo Jiang and more

Potential Business Impact:

Finds lost things even with bad pictures.

Business Areas:
Image Recognition Data and Analytics, Software

Multi-modal object Re-IDentification (ReID), which aims to retrieve specific targets across cameras using heterogeneous visual data sources, has gained considerable attention. Existing methods primarily aim to improve identification performance, but often overlook the uncertainty arising from inherent defects, such as intra-modal noise and inter-modal conflicts. This uncertainty is particularly significant under fine-grained local occlusion and frame loss, which poses a challenge for multi-modal learning. To address this challenge, we propose a robust approach named Uncertainty-Guided Graph model for multi-modal object ReID (UGG-ReID). UGG-ReID is designed to mitigate noise interference and facilitate effective multi-modal fusion by estimating both local and sample-level aleatoric uncertainty and explicitly modeling their dependencies. Specifically, we first propose a Gaussian patch-graph representation model that leverages uncertainty to quantify fine-grained local cues and capture their structural relationships. This process boosts the expressiveness of modal-specific information, making the generated embeddings both more informative and more robust. Subsequently, we design an uncertainty-guided mixture-of-experts strategy that dynamically routes samples to experts exhibiting low uncertainty. This strategy suppresses noise-induced instability and thereby enhances robustness. Meanwhile, we design an uncertainty-guided routing mechanism to strengthen multi-modal interaction, further improving performance. UGG-ReID is comprehensively evaluated on five representative multi-modal object ReID datasets encompassing diverse spectral modalities. Experimental results show that the proposed method achieves excellent performance on all datasets and is significantly better than current methods in terms of noise immunity. Our code will be made public upon acceptance.
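The abstract describes routing samples to experts that exhibit low uncertainty but gives no implementation details, and the authors' code is not yet public. The following is only a minimal sketch of one way such a gate could work, assuming each expert emits a feature vector together with a scalar log-variance as its uncertainty estimate. The function name `route_by_uncertainty`, the `top_k` parameter, and the softmax-over-negative-log-variance gate are all illustrative assumptions, not the paper's method.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_by_uncertainty(expert_outputs, expert_log_vars, top_k=2):
    """Fuse expert outputs, favoring experts with low predicted variance.

    expert_outputs: list of equal-length feature vectors, one per expert.
    expert_log_vars: one scalar log-variance per expert (hypothetical
        uncertainty head; a lower value means a more confident expert).
    Returns the fused vector and a dict mapping expert index to gate weight.
    """
    if len(expert_outputs) != len(expert_log_vars):
        raise ValueError("need exactly one log-variance per expert")
    if not 1 <= top_k <= len(expert_outputs):
        raise ValueError("top_k must be between 1 and the number of experts")

    # Keep only the top_k experts with the smallest log-variance.
    ranked = sorted(range(len(expert_log_vars)), key=lambda i: expert_log_vars[i])
    chosen = ranked[:top_k]

    # Softmax over negative log-variance makes each gate weight
    # proportional to the expert's precision (1 / variance).
    weights = softmax([-expert_log_vars[i] for i in chosen])

    dim = len(expert_outputs[0])
    fused = [0.0] * dim
    for weight, idx in zip(weights, chosen):
        for d in range(dim):
            fused[d] += weight * expert_outputs[idx][d]
    return fused, dict(zip(chosen, weights))


# Toy example: the third expert is very uncertain and gets dropped.
outputs = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
log_vars = [0.0, math.log(3.0), 4.0]
fused, gates = route_by_uncertainty(outputs, log_vars, top_k=2)
```

In this toy example the two retained experts have variances 1 and 3, so their gate weights are approximately 0.75 and 0.25, and the noisy third expert contributes nothing to the fused vector. A learned model would predict the log-variances from the input rather than take them as given.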

Page Count
15 pages

Category
Computer Science:
CV and Pattern Recognition