Multimodal Perception for Goal-oriented Navigation: A Survey
By: I-Tak Ieong, Hao Tang
Potential Business Impact:
Helps robots learn to find their way to specific goals using what they see, hear, and are told.
Goal-oriented navigation presents a fundamental challenge for autonomous systems, requiring agents to navigate complex environments to reach designated targets. This survey offers a comprehensive analysis of multimodal navigation approaches through the unifying perspective of inference domains, exploring how agents perceive, reason about, and navigate environments using visual, linguistic, and acoustic information. Our key contributions include organizing navigation methods based on their primary environmental reasoning mechanisms across inference domains; systematically analyzing how shared computational foundations support seemingly disparate approaches across different navigation tasks; identifying recurring patterns and distinctive strengths across various navigation paradigms; and examining the integration challenges and opportunities of multimodal perception to enhance navigation capabilities. In addition, we review approximately 200 relevant articles to provide an in-depth understanding of the current landscape.
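The survey describes these approaches at a conceptual level. As a rough illustration of the perception pattern it covers, below is a minimal sketch (not from the paper) of a policy that encodes visual, linguistic, and acoustic observations separately, fuses them, and keeps a recurrent belief state to pick a navigation action. All class names, feature dimensions, and the four-action space are illustrative assumptions.

import torch
import torch.nn as nn

class MultimodalNavPolicy(nn.Module):
    """Illustrative late-fusion navigation policy (hypothetical, not the survey's method)."""

    def __init__(self, vis_dim=512, lang_dim=300, audio_dim=128,
                 hidden_dim=256, num_actions=4):
        super().__init__()
        # Per-modality encoders project heterogeneous inputs to a shared size.
        self.vis_enc = nn.Linear(vis_dim, hidden_dim)
        self.lang_enc = nn.Linear(lang_dim, hidden_dim)
        self.audio_enc = nn.Linear(audio_dim, hidden_dim)
        # A GRU cell maintains the agent's belief over the partially observed scene.
        self.state = nn.GRUCell(3 * hidden_dim, hidden_dim)
        # Policy head over discrete actions (e.g. forward, turn-left, turn-right, stop).
        self.policy = nn.Linear(hidden_dim, num_actions)

    def forward(self, vis, lang, audio, h):
        # Encode each modality, then fuse by concatenation (late fusion).
        fused = torch.cat([
            torch.relu(self.vis_enc(vis)),
            torch.relu(self.lang_enc(lang)),
            torch.relu(self.audio_enc(audio)),
        ], dim=-1)
        h = self.state(fused, h)   # update recurrent belief state
        return self.policy(h), h   # action logits + next hidden state

# Usage: one decision step on dummy features.
policy = MultimodalNavPolicy()
h = torch.zeros(1, 256)
logits, h = policy(torch.randn(1, 512), torch.randn(1, 300),
                   torch.randn(1, 128), h)
action = logits.argmax(dim=-1)  # greedy action selection

Separate per-modality encoders with late fusion is only one common design; many of the methods the survey organizes instead rely on cross-modal attention or explicit map-based representations.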
Similar Papers
Deep Learning-Based Multi-Modal Fusion for Robust Robot Perception and Navigation
Machine Learning (CS)
Helps robots see and move better in tricky places.
GeoNav: Empowering MLLMs with Explicit Geospatial Reasoning Abilities for Language-Goal Aerial Navigation
Robotics
Drones find places using words and maps.
Enhancing Situational Awareness in Underwater Robotics with Multi-modal Spatial Perception
Robotics
Helps robots see and map underwater clearly.