FOM-Nav: Frontier-Object Maps for Object Goal Navigation
By: Thomas Chabal, Shizhe Chen, Jean Ponce, et al.
Potential Business Impact:
Robots find hidden objects faster in new places.
This paper addresses the Object Goal Navigation problem, where a robot must efficiently find a target object in an unknown environment. Existing implicit memory-based methods struggle with long-term memory retention and planning, while explicit map-based approaches lack rich semantic information. To address these challenges, we propose FOM-Nav, a modular framework that improves exploration efficiency through Frontier-Object Maps and vision-language models. Our Frontier-Object Maps are built online and jointly encode spatial frontiers and fine-grained object information. Using this representation, a vision-language model performs multimodal scene understanding and high-level goal prediction, which a low-level planner executes for efficient trajectory generation. To train FOM-Nav, we automatically construct large-scale navigation datasets from real-world scanned environments. Extensive experiments validate the effectiveness of our model design and constructed dataset. FOM-Nav achieves state-of-the-art performance on the MP3D and HM3D benchmarks, particularly on the navigation-efficiency metric SPL, and yields promising results on a real robot.
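The abstract does not detail how frontiers are computed, but in frontier-based exploration a frontier is conventionally a free cell bordering unexplored space. The sketch below illustrates that standard idea on a 2D occupancy grid; the cell encoding (0 = free, 1 = occupied, -1 = unknown) and the function name are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

# Assumed cell encoding (not from the paper):
FREE, OCCUPIED, UNKNOWN = 0, 1, -1

def extract_frontiers(grid):
    """Return a boolean mask marking frontier cells: free cells with at
    least one 4-connected neighbor that is still unknown."""
    h, w = grid.shape
    frontiers = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            if grid[y, x] != FREE:
                continue
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and grid[ny, nx] == UNKNOWN:
                    frontiers[y, x] = True
                    break
    return frontiers

# Toy partially-explored map: the right edge is still unknown.
grid = np.array([
    [ 0,  0, -1],
    [ 0,  1, -1],
    [ 0,  0,  0],
])
mask = extract_frontiers(grid)
# Frontier cells are (0, 1) and (2, 2): free cells touching unknown space.
```

In FOM-Nav these frontiers are additionally paired with fine-grained object detections, so the high-level policy can reason jointly about where space is unexplored and what objects have been seen near each frontier.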
Similar Papers
Embodied Navigation Foundation Model
Robotics
Robots learn to move anywhere, doing many jobs.
What Matters in RL-Based Methods for Object-Goal Navigation? An Empirical Study and A Unified Framework
Robotics
Robots find objects in new places better.
SemNav: A Model-Based Planner for Zero-Shot Object Goal Navigation Using Vision-Foundation Models
Robotics
Helps robots find things without prior training.