ExploreVLM: Closed-Loop Robot Exploration Task Planning with Vision-Language Models
By: Zhichen Lou, Kechun Xu, Zhongxiang Zhou and more
Potential Business Impact:
Robots learn to explore and do tasks better.
The advancement of embodied intelligence is accelerating the integration of robots into daily life as human assistants. This evolution requires robots to not only interpret high-level instructions and plan tasks but also perceive and adapt within dynamic environments. Vision-Language Models (VLMs) present a promising solution by combining visual understanding and language reasoning. However, existing VLM-based methods struggle with interactive exploration, accurate perception, and real-time plan adaptation. To address these challenges, we propose ExploreVLM, a novel closed-loop task planning framework powered by VLMs. The framework is built around a step-wise feedback mechanism that enables real-time plan adjustment and supports interactive exploration. At its core is a dual-stage task planner with self-reflection, enhanced by an object-centric spatial relation graph that provides structured, language-grounded scene representations to guide perception and planning. An execution validator closes the loop by verifying each action and triggering re-planning when needed. Extensive real-world experiments demonstrate that ExploreVLM significantly outperforms state-of-the-art baselines, particularly in exploration-centric tasks. Ablation studies further validate the critical role of the reflective planner and structured perception in achieving robust and efficient task execution.
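To make the closed-loop idea in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation. The toy world, the rule-based stand-ins for the VLM planner, and every name in it (ToyWorld, draft_plan, reflect, validate, run_closed_loop) are assumptions made for illustration only. It shows the general pattern the abstract describes: observe the scene as object-centric relation triples, draft a plan, self-check it against the scene, execute one step, validate the result, and re-plan.

```python
from dataclasses import dataclass


@dataclass
class ToyWorld:
    """Hypothetical stand-in for the robot and its camera (not from the paper)."""
    drawer_open: bool = False
    holding_apple: bool = False
    fail_next_action: bool = False  # simulate a single execution failure

    def observe(self):
        """Return the visible scene as (subject, relation, object) triples."""
        graph = {("drawer", "on", "table")}
        if self.holding_apple:
            graph.add(("apple", "in", "gripper"))
        elif self.drawer_open:
            graph.add(("apple", "in", "drawer"))
        return graph

    def execute(self, action):
        """Apply an action; silently does nothing if a failure is simulated."""
        if self.fail_next_action:
            self.fail_next_action = False
            return
        if action == "open drawer":
            self.drawer_open = True
        elif action == "pick apple" and self.drawer_open:
            self.holding_apple = True


def draft_plan(goal_object, graph):
    """Stage 1: propose a plan (a VLM call in a real system; a fixed rule here)."""
    return [f"pick {goal_object}"]


def reflect(plan, goal_object, graph):
    """Stage 2: self-check the draft against the scene graph.

    If the goal object is not visible, prepend an exploration step.
    """
    visible = {subject for subject, _, _ in graph}
    if goal_object not in visible:
        return ["open drawer"] + plan
    return plan


# Assumed expected effect of each action, used by the toy validator.
EXPECTED_EFFECT = {
    "open drawer": ("apple", "in", "drawer"),
    "pick apple": ("apple", "in", "gripper"),
}


def validate(action, graph):
    """Check that the action's expected relation now appears in the scene."""
    return EXPECTED_EFFECT[action] in graph


def run_closed_loop(world, goal_object, max_steps=10):
    """Re-plan after every step and log (action, validated) pairs."""
    log = []
    for _ in range(max_steps):
        graph = world.observe()
        if (goal_object, "in", "gripper") in graph:
            break  # goal reached
        plan = reflect(draft_plan(goal_object, graph), goal_object, graph)
        action = plan[0]  # execute only the first step, then re-observe
        world.execute(action)
        log.append((action, validate(action, world.observe())))
    return log
```

In this toy setting, calling run_closed_loop(ToyWorld(), "apple") first opens the drawer to reveal the hidden apple and then picks it. With ToyWorld(fail_next_action=True), the validator flags the failed first attempt and the loop retries the same step on the next iteration.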
Similar Papers
Perceiving, Reasoning, Adapting: A Dual-Layer Framework for VLM-Guided Precision Robotic Manipulation
Robotics
Robots learn to do tricky jobs with speed and accuracy.
Experience is the Best Teacher: Grounding VLMs for Robotics through Self-Generated Memory
Robotics
Robots learn from mistakes to do tasks better.
Think, Remember, Navigate: Zero-Shot Object-Goal Navigation with VLM-Powered Reasoning
Robotics
Helps robots explore new places much faster.