Multi-modal Loop Closure Detection with Foundation Models in Severely Unstructured Environments
By: Laura Alejandra Encinar Gonzalez, John Folkesson, Rudolph Triebel and more
Potential Business Impact:
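Helps robots recognize places they have already visited and work out where they are, even in terrain with few distinctive features and no satellite positioning.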
Robust loop closure detection is a critical component of Simultaneous Localization and Mapping (SLAM) algorithms in GNSS-denied environments, such as planetary exploration. In these settings, visual place recognition often fails due to perceptual aliasing and weak textures, while LiDAR-based methods suffer from sparsity and ambiguity. This paper presents MPRF, a multimodal pipeline that leverages transformer-based foundation models for both vision and LiDAR modalities to achieve robust loop closure in severely unstructured environments. Unlike prior work limited to retrieval, MPRF integrates a two-stage visual retrieval strategy with explicit 6-DoF pose estimation, combining DINOv2 features with SALAD aggregation for efficient candidate screening and SONATA-based LiDAR descriptors for geometric verification. Experiments on the S3LI and S3LI Vulcano datasets show that MPRF outperforms state-of-the-art retrieval methods in precision while enhancing pose estimation robustness in low-texture regions. By providing interpretable correspondences suitable for SLAM back-ends, MPRF achieves a favorable trade-off between accuracy, efficiency, and reliability, demonstrating the potential of foundation models to unify place recognition and pose estimation. Code and models will be released at github.com/DLR-RM/MPRF.
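The retrieve-then-verify flow described in the abstract can be illustrated with a minimal NumPy sketch. This is not the MPRF implementation. The descriptors are plain arrays standing in for DINOv2/SALAD and SONATA outputs, all function names are hypothetical, mutual nearest-neighbour counting is an assumed stand-in for the second retrieval stage, and the Kabsch least-squares solver is an assumed choice for recovering a 6-DoF pose from matched 3D points.

```python
import numpy as np


def screen_candidates(query_desc, db_descs, top_k=5):
    """Stage 1 (assumed): rank database places by cosine similarity
    between one global descriptor per place."""
    q = query_desc / np.linalg.norm(query_desc)
    db = db_descs / np.linalg.norm(db_descs, axis=1, keepdims=True)
    sims = db @ q
    order = np.argsort(-sims)[:top_k]
    return order, sims[order]


def rerank_by_mutual_matches(query_local, cand_locals):
    """Stage 2 (assumed): re-rank candidates by the number of mutual
    nearest-neighbour matches between local feature sets."""
    scores = []
    for cand in cand_locals:
        sim = query_local @ cand.T        # (n_query, n_cand) similarities
        fwd = sim.argmax(axis=1)          # best candidate feature per query feature
        bwd = sim.argmax(axis=0)          # best query feature per candidate feature
        mutual = bwd[fwd] == np.arange(len(query_local))
        scores.append(int(mutual.sum()))
    return np.argsort(-np.asarray(scores)), scores


def estimate_rigid_transform(src, dst):
    """Kabsch solver (assumed): least-squares R, t with dst ~= R @ src + t,
    given matched 3D points of shape (n, 3)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t
```

In a full system the matched 3D points passed to `estimate_rigid_transform` would come from verified correspondences, typically wrapped in an outlier-rejection loop such as RANSAC, and the resulting pose would be handed to the SLAM back-end as a loop closure constraint.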
Similar Papers
Multi-Mapcher: Loop Closure Detection-Free Heterogeneous LiDAR Multi-Session SLAM Leveraging Outlier-Robust Registration for Autonomous Vehicles
Robotics
Lets robots merge maps built with different LiDAR sensors across sessions.
A Pseudo Global Fusion Paradigm-Based Cross-View Network for LiDAR-Based Place Recognition
CV and Pattern Recognition
Helps cars find their way without GPS.
IRDFusion: Iterative Relation-Map Difference guided Feature Fusion for Multispectral Object Detection
CV and Pattern Recognition
Helps cameras see better in fog and darkness.