MOS: Mitigating Optical-SAR Modality Gap for Cross-Modal Ship Re-Identification
By: Yujian Zhao, Hankun Liu, Guanglin Niu
Potential Business Impact:
Helps cameras and radar find the same ship.
Cross-modal ship re-identification (ReID) between optical and synthetic aperture radar (SAR) imagery has recently emerged as a critical yet underexplored task in maritime intelligence and surveillance. However, the substantial modality gap between optical and SAR images poses a major challenge for robust identification. To address this issue, we propose MOS, a novel framework designed to mitigate the optical-SAR modality gap and achieve modality-consistent feature learning for optical-SAR cross-modal ship ReID. MOS consists of two core components: (1) Modality-Consistent Representation Learning (MCRL), which applies denoising-based SAR image processing and a class-wise modality alignment loss to align intra-identity feature distributions across modalities; and (2) Cross-modal Data Generation and Feature fusion (CDGF), which leverages a Brownian bridge diffusion model to synthesize cross-modal samples whose features are subsequently fused with the original features during inference to enhance alignment and discriminability. Extensive experiments on the HOSS ReID dataset demonstrate that MOS surpasses state-of-the-art methods across all evaluation protocols, improving Rank-1 (R1) accuracy by +3.0%, +6.2%, and +16.4% under the ALL to ALL, Optical to SAR, and SAR to Optical settings, respectively. The code and trained models will be released upon publication.
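The abstract does not give the exact form of the class-wise modality alignment loss, the fusion rule, or the evaluation code, so the following is only an illustrative sketch under stated assumptions. It assumes the alignment loss is the mean squared distance between per-identity optical and SAR feature centroids, that fusion is a weighted average of L2-normalized original and generated features, and that R1 is computed by cosine-similarity nearest-neighbour retrieval. All function names and the `weight` parameter are hypothetical and are not taken from the MOS code.

```python
import math
from collections import defaultdict


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def class_wise_alignment_loss(opt_feats, opt_ids, sar_feats, sar_ids):
    """ASSUMED loss form: mean squared distance between the optical and SAR
    centroids of each identity that appears in both modalities."""
    by_id_opt, by_id_sar = defaultdict(list), defaultdict(list)
    for feat, pid in zip(opt_feats, opt_ids):
        by_id_opt[pid].append(feat)
    for feat, pid in zip(sar_feats, sar_ids):
        by_id_sar[pid].append(feat)
    shared = sorted(set(by_id_opt) & set(by_id_sar))
    if not shared:
        return 0.0
    total = 0.0
    for pid in shared:
        c_opt, c_sar = centroid(by_id_opt[pid]), centroid(by_id_sar[pid])
        total += sum((a - b) ** 2 for a, b in zip(c_opt, c_sar))
    return total / len(shared)


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to (approximately) unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def fuse_features(orig, gen, weight=0.5):
    """HYPOTHETICAL inference-time fusion: convex combination of the
    normalized original feature and the generated-sample feature."""
    o, g = l2_normalize(orig), l2_normalize(gen)
    return l2_normalize([(1.0 - weight) * a + weight * b for a, b in zip(o, g)])


def rank1_accuracy(query_feats, query_ids, gallery_feats, gallery_ids):
    """Fraction of queries whose most cosine-similar gallery item shares their identity."""
    gallery = [l2_normalize(g) for g in gallery_feats]
    hits = 0
    for q, qid in zip(query_feats, query_ids):
        qn = l2_normalize(q)
        sims = [sum(a * b for a, b in zip(qn, g)) for g in gallery]
        best = max(range(len(sims)), key=sims.__getitem__)
        hits += gallery_ids[best] == qid
    return hits / len(query_feats)
```

In this sketch a loss of zero means every shared identity has identical optical and SAR centroids, which is one simple way to read "aligning intra-identity feature distributions across modalities"; the actual MOS loss may differ.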
Similar Papers
Semi-supervised Multiscale Matching for SAR-Optical Image
CV and Pattern Recognition
Matches satellite pictures without needing manual labels.
Modality-Transition Representation Learning for Visible-Infrared Person Re-Identification
CV and Pattern Recognition
Helps cameras find people in dark or bright light.
Learning Representation and Synergy Invariances: A Provable Framework for Generalized Multimodal Face Anti-Spoofing
CV and Pattern Recognition
Keeps fake faces from fooling face scanners.