Multimodal Perception for Goal-oriented Navigation: A Survey
By: I-Tak Ieong, Hao Tang
Potential Business Impact:
Helps robots learn to find their way to specific goals using what they see, hear, and are told.
Goal-oriented navigation presents a fundamental challenge for autonomous systems, requiring agents to navigate complex environments to reach designated targets. This survey offers a comprehensive analysis of multimodal navigation approaches through the unifying perspective of inference domains, exploring how agents perceive, reason about, and navigate environments using visual, linguistic, and acoustic information. Our key contributions include organizing navigation methods based on their primary environmental reasoning mechanisms across inference domains; systematically analyzing how shared computational foundations support seemingly disparate approaches across different navigation tasks; identifying recurring patterns and distinctive strengths across various navigation paradigms; and examining the integration challenges and opportunities of multimodal perception to enhance navigation capabilities. In addition, we review approximately 200 relevant articles to provide an in-depth understanding of the current landscape.
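The survey describes these approaches at a conceptual level. As a rough illustration of the perception pattern it covers, below is a minimal sketch (not from the paper) of a policy that encodes visual, linguistic, and acoustic observations separately, fuses them, and keeps a recurrent belief state to pick a navigation action. All class names, feature dimensions, and the four-action space are illustrative assumptions.

import torch
import torch.nn as nn

class MultimodalNavPolicy(nn.Module):
    """Illustrative late-fusion navigation policy (hypothetical, not the survey's method)."""

    def __init__(self, vis_dim=512, lang_dim=300, audio_dim=128,
                 hidden_dim=256, num_actions=4):
        super().__init__()
        # Per-modality encoders project heterogeneous inputs to a shared size.
        self.vis_enc = nn.Linear(vis_dim, hidden_dim)
        self.lang_enc = nn.Linear(lang_dim, hidden_dim)
        self.audio_enc = nn.Linear(audio_dim, hidden_dim)
        # A GRU cell maintains the agent's belief over the partially observed scene.
        self.state = nn.GRUCell(3 * hidden_dim, hidden_dim)
        # Policy head over discrete actions (e.g. forward, turn-left, turn-right, stop).
        self.policy = nn.Linear(hidden_dim, num_actions)

    def forward(self, vis, lang, audio, h):
        # Encode each modality, then fuse by concatenation (late fusion).
        fused = torch.cat([
            torch.relu(self.vis_enc(vis)),
            torch.relu(self.lang_enc(lang)),
            torch.relu(self.audio_enc(audio)),
        ], dim=-1)
        h = self.state(fused, h)   # update recurrent belief state
        return self.policy(h), h   # action logits + next hidden state

# Usage: one decision step on dummy features.
policy = MultimodalNavPolicy()
h = torch.zeros(1, 256)
logits, h = policy(torch.randn(1, 512), torch.randn(1, 300),
                   torch.randn(1, 128), h)
action = logits.argmax(dim=-1)  # greedy action selection

Separate per-modality encoders with late fusion is only one common design; many of the methods the survey organizes instead rely on cross-modal attention or explicit map-based representations.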
Similar Papers
Deep Learning-Based Multi-Modal Fusion for Robust Robot Perception and Navigation
Machine Learning (CS)
Helps robots see and move better in tricky places.
GeoNav: Empowering MLLMs with Explicit Geospatial Reasoning Abilities for Language-Goal Aerial Navigation
Robotics
Drones find places using words and maps.
Enhancing Situational Awareness in Underwater Robotics with Multi-modal Spatial Perception
Robotics
Helps robots see and map underwater clearly.