MAG-Nav: Language-Driven Object Navigation Leveraging Memory-Reserved Active Grounding
By: Weifan Zhang, Tingguang Li, Yuzhen Liu
Potential Business Impact:
Robots follow spoken directions in new places.
Visual navigation in unknown environments based solely on natural language descriptions is a key capability for intelligent robots. In this work, we propose a navigation framework built upon off-the-shelf Visual Language Models (VLMs), enhanced with two human-inspired mechanisms: perspective-based active grounding, which dynamically adjusts the robot's viewpoint for improved visual inspection, and historical memory backtracking, which enables the system to retain and re-evaluate uncertain observations over time. Unlike existing approaches that passively rely on incidental visual inputs, our method actively optimizes perception and leverages memory to resolve ambiguity, significantly improving vision-language grounding in complex, unseen environments. Our framework operates in a zero-shot manner, achieving strong generalization to diverse and open-ended language descriptions without requiring labeled data or model fine-tuning. Experimental results on Habitat-Matterport 3D (HM3D) show that our method outperforms state-of-the-art approaches in language-driven object navigation. We further demonstrate its practicality through real-world deployment on a quadruped robot, achieving robust and effective navigation performance.
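To make the two mechanisms concrete, here is a minimal Python sketch of how perspective-based active grounding and historical memory backtracking could fit together in a navigation loop. This is not the authors' implementation: the Observation fields, the confidence thresholds, the robot.move_to / robot.capture interface, and the query_vlm call are all illustrative placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple
import math

@dataclass
class Observation:
    """A candidate sighting of the target object (hypothetical structure)."""
    position: Tuple[float, float]   # (x, y) of the candidate in the map frame
    viewpoint: Tuple[float, float]  # robot pose from which it was observed
    score: float                    # VLM grounding confidence in [0, 1]
    description: str                # language description being grounded

class MemoryReservedNavigator:
    """Illustrative combination of active grounding and a memory of
    uncertain observations; thresholds and the VLM call are placeholders."""

    def __init__(self, accept_thr: float = 0.8, keep_thr: float = 0.4):
        self.accept_thr = accept_thr        # confident enough to declare the goal
        self.keep_thr = keep_thr            # uncertain but worth remembering
        self.memory: List[Observation] = [] # reserved uncertain observations

    def query_vlm(self, image, description: str) -> float:
        """Placeholder for an off-the-shelf VLM grounding call."""
        raise NotImplementedError

    def propose_viewpoints(self, target: Tuple[float, float],
                           radius: float = 1.5, n: int = 4):
        """Sample viewpoints on a circle around the candidate object."""
        tx, ty = target
        return [(tx + radius * math.cos(2 * math.pi * k / n),
                 ty + radius * math.sin(2 * math.pi * k / n)) for k in range(n)]

    def active_grounding(self, robot, candidate: Observation,
                         description: str) -> Observation:
        """Move to better viewpoints around a candidate and re-score it."""
        best = candidate
        for viewpoint in self.propose_viewpoints(candidate.position):
            robot.move_to(viewpoint)                     # assumed robot API
            score = self.query_vlm(robot.capture(), description)
            if score > best.score:
                best = Observation(candidate.position, viewpoint, score, description)
            if best.score >= self.accept_thr:
                break                                    # inspection is conclusive
        return best

    def step(self, robot, candidate: Observation,
             description: str) -> Optional[Observation]:
        """One decision step: accept, remember, or discard a candidate."""
        refined = self.active_grounding(robot, candidate, description)
        if refined.score >= self.accept_thr:
            return refined                               # navigation goal found
        if refined.score >= self.keep_thr:
            self.memory.append(refined)                  # reserve for backtracking
        return None

    def backtrack(self, robot, description: str) -> Optional[Observation]:
        """When exploration stalls, revisit the most promising remembered candidate."""
        if not self.memory:
            return None
        best = max(self.memory, key=lambda o: o.score)
        self.memory.remove(best)
        robot.move_to(best.viewpoint)
        return self.step(robot, best, description)
```

In this sketch, candidates that score between the two thresholds are neither accepted nor discarded but reserved in memory, so the robot can return and re-inspect them from new viewpoints once frontier exploration stops yielding progress.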
Similar Papers
Think, Remember, Navigate: Zero-Shot Object-Goal Navigation with VLM-Powered Reasoning
Robotics
Helps robots explore new places much faster.
Hierarchical Language Models for Semantic Navigation and Manipulation in an Aerial-Ground Robotic System
Robotics
Aerial and ground robots work together, using language models to move things.
Vision-and-Language Navigation with Analogical Textual Descriptions in LLMs
Artificial Intelligence
Helps robots understand places better to find their way.