
Semantic-Enhanced Cross-Modal Place Recognition for Robust Robot Localization

Published: September 16, 2025 | arXiv ID: 2509.13474v1

By: Yujia Lin, Nicholas Evans

Potential Business Impact:

Helps robots find their way without GPS.

Business Areas:

Image Recognition Data and Analytics, Software

Accurately localizing robots in environments without GPS is a challenging task. Visual Place Recognition (VPR) techniques can potentially achieve this goal, but existing RGB-based methods are sensitive to changes in illumination, weather, and season. Existing cross-modal localization methods leverage the geometric properties of RGB images and 3D LiDAR maps to reduce this sensitivity. Even so, current state-of-the-art methods struggle with complex scenes, fine-grained or high-resolution matching, and viewpoint changes. In this work, we introduce Semantic-Enhanced Cross-Modal Place Recognition (SCM-PR), a framework that uses RGB images together with high-level semantics for robust localization in LiDAR maps. Our proposed method introduces: a VMamba backbone for extracting features from RGB images; a Semantic-Aware Feature Fusion (SAFF) module that uses both place descriptors and segmentation masks; LiDAR descriptors that incorporate both semantics and geometry; and a cross-modal semantic attention mechanism in NetVLAD to improve matching. Incorporating semantic information was also instrumental in designing Multi-View Semantic-Geometric Matching and a Semantic Consistency Loss, both within a contrastive learning framework. Our experiments on the KITTI and KITTI-360 datasets show that SCM-PR achieves state-of-the-art performance compared to other cross-modal place recognition methods.
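To make the retrieval setting concrete, here is a minimal sketch of cross-modal place recognition as descriptor matching, together with a generic contrastive objective. It is not the authors' implementation: the abstract does not specify the loss, so an InfoNCE-style loss is swapped in as a stand-in, and the function names, descriptor size, temperature, and random toy data are all assumptions made for illustration. In practice the descriptors would come from the image and LiDAR branches rather than a random generator.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def retrieve(query_desc, map_desc):
    """For each image descriptor, return the index of the most similar map descriptor."""
    sim = l2_normalize(query_desc) @ l2_normalize(map_desc).T
    return sim.argmax(axis=1)

def info_nce_loss(query_desc, map_desc, temperature=0.1):
    """InfoNCE-style contrastive loss (a stand-in, not the paper's loss).

    Assumes row i of query_desc and row i of map_desc describe the same place.
    """
    logits = l2_normalize(query_desc) @ l2_normalize(map_desc).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

# Toy data (hypothetical): 5 places with 16-dimensional descriptors.
rng = np.random.default_rng(0)
map_desc = rng.normal(size=(5, 16))                      # stands in for LiDAR-map descriptors
query_desc = map_desc + 0.01 * rng.normal(size=(5, 16))  # stands in for image descriptors

matches = retrieve(query_desc, map_desc)
loss = info_nce_loss(query_desc, map_desc)
print(matches, loss)
```

A lower loss means each query descriptor sits closer to its own place than to the other places in the batch, which is the property a contrastive framework is trained to encourage.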

A Pseudo Global Fusion Paradigm-Based Cross-View Network for LiDAR-Based Place Recognition

CV and Pattern Recognition

Helps cars find their way without GPS.

12 Aug 2025

89%

DSFormer: A Dual-Scale Cross-Learning Transformer for Visual Place Recognition

CV and Pattern Recognition

Helps robots find their way in new places.

24 Jul 2025

89%

EmbodiedPlace: Learning Mixture-of-Features with Embodied Constraints for Visual Place Recognition

CV and Pattern Recognition

Helps robots remember where they've been.

16 Jun 2025


Page Count

17 pages
