Hierarchical Scoring with 3D Gaussian Splatting for Instance Image-Goal Navigation
By: Yijie Deng, Shuaihang Yuan, Geeta Chandra Raju Bethala and more
Potential Business Impact:
Finds targets faster by picking smart views.
Instance Image-Goal Navigation (IIN) requires autonomous agents to identify and navigate to a target object or location depicted in a reference image captured from any viewpoint. While recent methods leverage powerful novel view synthesis (NVS) techniques, such as three-dimensional Gaussian splatting (3DGS), they typically rely on randomly sampling multiple viewpoints or trajectories to ensure comprehensive coverage of discriminative visual cues. This approach, however, creates significant redundancy through overlapping image samples and lacks principled view selection, substantially increasing both rendering and comparison overhead. In this paper, we introduce a novel IIN framework with a hierarchical scoring paradigm that estimates optimal viewpoints for target matching. Our approach integrates cross-level semantic scoring, utilizing CLIP-derived relevancy fields to identify regions with high semantic similarity to the target object class, with fine-grained local geometric scoring that performs precise pose estimation within promising regions. Extensive evaluations demonstrate that our method achieves state-of-the-art performance on simulated IIN benchmarks and is applicable in real-world settings.
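The coarse-to-fine idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes each candidate region already carries a semantic embedding (a stand-in for a CLIP-derived relevancy value) and a short list of candidate views, and it uses a hypothetical `fine_score` callback in place of the geometric pose scoring. The names `hierarchical_select`, `regions`, and `toy_fine_score` are invented for illustration. The point is only that the expensive fine score is computed for views in the top-ranked regions rather than for every sampled view.

```python
import math
from typing import Callable, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hierarchical_select(regions, goal_embedding, fine_score: Callable, top_k: int = 2):
    """Pick the best view using a coarse semantic stage, then a fine stage.

    Coarse stage: rank regions by semantic similarity to the goal.
    Fine stage: run the (assumed expensive) fine_score only on views
    inside the top_k regions.
    Returns (best_view, best_score, number_of_fine_evaluations).
    """
    ranked = sorted(
        regions,
        key=lambda r: cosine(r["embedding"], goal_embedding),
        reverse=True,
    )
    best_view, best_score, evaluated = None, -math.inf, 0
    for region in ranked[:top_k]:
        for view in region["views"]:
            score = fine_score(view)
            evaluated += 1
            if score > best_score:
                best_view, best_score = view, score
    if best_view is None:
        raise ValueError("no candidate views in the shortlisted regions")
    return best_view, best_score, evaluated


# Toy data: 2-D embeddings and (x, y, yaw) views, all made up for illustration.
regions = [
    {"name": "kitchen", "embedding": [1.0, 0.0],
     "views": [(0.0, 0.0, 0.0), (0.0, 1.0, 90.0)]},
    {"name": "bedroom", "embedding": [0.0, 1.0],
     "views": [(5.0, 5.0, 0.0), (5.0, 6.0, 90.0)]},
    {"name": "living", "embedding": [0.7, 0.7],
     "views": [(2.0, 2.0, 0.0), (2.0, 3.0, 180.0)]},
]
goal_embedding = [0.1, 0.9]


def toy_fine_score(view):
    # Hypothetical stand-in for geometric matching: closer to (5, 6) is better.
    return -math.hypot(view[0] - 5.0, view[1] - 6.0)


best_view, best_score, n_evaluated = hierarchical_select(
    regions, goal_embedding, toy_fine_score, top_k=2
)
total_views = sum(len(r["views"]) for r in regions)
print(best_view, n_evaluated, "of", total_views, "views scored")
```

In this toy setup the coarse stage discards the least relevant region, so the fine score runs on four of the six candidate views. In a real pipeline the saving would depend on how many regions are shortlisted and how costly rendering and matching are.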
Similar Papers
SplatSearch: Instance Image Goal Navigation for Mobile Robots using 3D Gaussian Splatting and Diffusion Models
Robotics
Robot finds lost things using a picture.
GSplatVNM: Point-of-View Synthesis for Visual Navigation Models Using Gaussian Splatting
Robotics
Robot finds its way using fewer pictures.
IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation
CV and Pattern Recognition
Helps robots find things using just a picture.