HyPerNav: Hybrid Perception for Object-Oriented Navigation in Unknown Environment
By: Zecheng Yin, Hao Zhao, Zhen Li
Potential Business Impact:
Robots find things better using two kinds of sight.
Object-oriented navigation (ObjNav) enables a robot to navigate directly and autonomously to a target object in an unknown environment. Effective perception is critical for autonomous navigation in such environments. While egocentric observations from RGB-D sensors provide abundant local information, real-time top-down maps offer valuable global context for ObjNav. Nevertheless, most existing studies focus on a single source and seldom integrate these two complementary perceptual modalities, despite the fact that humans naturally attend to both. Leveraging the rapid advancement of Vision-Language Models (VLMs), we propose Hybrid Perception Navigation (HyPerNav), which uses VLMs' strong reasoning and vision-language understanding to jointly perceive local and global information and thereby enhance the effectiveness and intelligence of navigation in unknown environments. In both large-scale simulation evaluations and real-world validation, our method achieves state-of-the-art performance against popular baselines. Benefiting from hybrid perception, our method captures richer cues and finds target objects more effectively by simultaneously leveraging egocentric observations and the top-down map. Our ablation study further shows that each of the two perceptual modalities contributes to navigation performance.
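The hybrid-perception idea, pairing a local egocentric view with a global top-down map in a single VLM query, can be sketched as follows. This is an illustrative sketch only, not the paper's published code: the function name `build_hybrid_prompt`, the `Observation` container, and the prompt wording are all hypothetical.

```python
# Hypothetical sketch of HyPerNav-style hybrid perception: compose one
# multimodal VLM query from a local egocentric frame and a global top-down map.
from dataclasses import dataclass
from typing import List, Dict

@dataclass
class Observation:
    egocentric_rgbd: str  # placeholder for the egocentric RGB-D frame (e.g. an image path)
    topdown_map: str      # placeholder for the real-time top-down map image

def build_hybrid_prompt(obs: Observation, target: str) -> List[Dict]:
    """Pair the local (egocentric) and global (top-down) views in a single
    query, so the VLM can reason over both before choosing a waypoint."""
    return [
        {"type": "image", "source": obs.egocentric_rgbd,
         "note": "local view: nearby objects and free space"},
        {"type": "image", "source": obs.topdown_map,
         "note": "global view: explored layout and unexplored frontiers"},
        {"type": "text",
         "text": f"Using both views, choose the next waypoint most likely "
                 f"to lead to the target object: {target}."},
    ]

prompt = build_hybrid_prompt(
    Observation("ego_frame_t.png", "topdown_map_t.png"), "chair")
print(len(prompt))  # 3 parts: two images plus the text instruction
```

In a full system, this message list would be sent to a VLM at each decision step, and the returned waypoint passed to a local planner; the sketch only shows how the two perceptual sources are combined into one query.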
Similar Papers
Uncertainty-Informed Active Perception for Open Vocabulary Object Goal Navigation
Robotics
Robot finds objects better, even with tricky words.
What Matters in RL-Based Methods for Object-Goal Navigation? An Empirical Study and A Unified Framework
Robotics
Robots find objects in new places better.