Language-Grounded Hierarchical Planning and Execution with Multi-Robot 3D Scene Graphs
By: Jared Strader, Aaron Ray, Jacob Arkin, and more
Potential Business Impact:
Robots work together to follow natural-language commands.
In this paper, we introduce a multi-robot system that integrates mapping, localization, and task and motion planning (TAMP) enabled by 3D scene graphs to execute complex instructions expressed in natural language. Our system builds a shared 3D scene graph incorporating an open-set object-based map, which is leveraged for multi-robot 3D scene graph fusion. This representation supports real-time, view-invariant relocalization (via the object-based map) and planning (via the 3D scene graph), allowing a team of robots to reason about their surroundings and execute complex tasks. Additionally, we introduce a planning approach that translates operator intent into Planning Domain Definition Language (PDDL) goals using a Large Language Model (LLM) by leveraging context from the shared 3D scene graph and robot capabilities. We provide an experimental assessment of the performance of our system on real-world tasks in large-scale, outdoor environments. A supplementary video is available at https://youtu.be/8xbGGOLfLAY.
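To make the LLM-to-PDDL step in the abstract concrete, the sketch below shows one plausible way operator intent, scene-graph context, and robot capabilities could be serialized into a prompt that asks for a PDDL goal expression. This is a minimal illustration, not the authors' implementation: the `SceneGraphNode`, `Robot`, `build_prompt`, and `query_llm` names are hypothetical, and `query_llm` is a stand-in that returns a canned response rather than calling a real model.

```python
from dataclasses import dataclass


@dataclass
class SceneGraphNode:
    """Minimal stand-in for an object node in the shared 3D scene graph."""
    node_id: str
    label: str          # open-set semantic label, e.g. "fire hydrant"
    position: tuple     # (x, y, z) in the shared map frame


@dataclass
class Robot:
    name: str
    capabilities: list  # e.g. ["goto", "inspect"]


def build_prompt(instruction: str, nodes: list, robots: list) -> str:
    """Serialize scene-graph and robot context into a prompt that asks
    the LLM for a single PDDL :goal expression (illustrative format)."""
    obj_lines = "\n".join(
        f"- {n.node_id}: {n.label} at {n.position}" for n in nodes)
    robot_lines = "\n".join(
        f"- {r.name}: capabilities {r.capabilities}" for r in robots)
    return (
        "Objects in the shared 3D scene graph:\n" + obj_lines + "\n"
        "Robots and their capabilities:\n" + robot_lines + "\n"
        f"Operator instruction: {instruction}\n"
        "Respond with a single PDDL :goal expression.")


def query_llm(prompt: str) -> str:
    """Placeholder for an LLM call; returns a canned PDDL goal here."""
    return "(:goal (and (inspected hydrant_3) (at spot_1 staging_area)))"


if __name__ == "__main__":
    nodes = [
        SceneGraphNode("hydrant_3", "fire hydrant", (12.4, -3.1, 0.0)),
        SceneGraphNode("staging_area", "open ground", (0.0, 0.0, 0.0)),
    ]
    robots = [Robot("spot_1", ["goto", "inspect"])]
    prompt = build_prompt(
        "Have a robot inspect the fire hydrant, then return to the staging area.",
        nodes, robots)
    print(query_llm(prompt))
```

The returned goal expression would then be handed to a conventional PDDL planner, which grounds it against the robots' action models to produce an executable task plan.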
Similar Papers
LLM-GROP: Visually Grounded Robot Task and Motion Planning with Large Language Models
Robotics
Robot learns to set tables using common sense.
Hierarchical Language Models for Semantic Navigation and Manipulation in an Aerial-Ground Robotic System
Robotics
Aerial and ground robots cooperate, using language models to navigate and move objects.
Hierarchical Temporal Logic Task and Motion Planning for Multi-Robot Systems
Robotics
Robots work together to finish jobs faster.