Teaching Robots Like Dogs: Learning Agile Navigation from Luring, Gesture, and Speech
By: Taerim Yoon, Dongho Kang, Jin Cheng, and more
In this work, we aim to enable legged robots to learn to interpret human social cues and produce appropriate behaviors through physical human guidance. However, learning through physical engagement can place a heavy burden on users when the process requires large amounts of human-provided data. To address this, we propose a human-in-the-loop framework that enables robots to acquire navigational behaviors in a data-efficient manner and to be controlled via multimodal natural human inputs, specifically gestural and verbal commands. We reconstruct interaction scenes in a physics-based simulation and aggregate data to mitigate the distributional shift arising from limited demonstration data. Our progressive goal cueing strategy adaptively feeds appropriate commands and navigation goals during training, leading to more accurate navigation and stronger alignment between human input and robot behavior. We evaluate our framework across six real-world agile navigation scenarios, including jumping over or avoiding obstacles. Experimental results show that the proposed method succeeds in nearly all trials, achieving a 97.15% task success rate with less than one hour of demonstration data in total.
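The data-efficiency claim rests on two ingredients described in the abstract: a DAgger-style aggregation loop, in which the learner's own rollouts in the reconstructed simulation scenes are relabeled with demonstration-derived actions, and a progressive goal-cueing schedule that shifts the cued goal over training. The sketch below is a minimal, illustrative version of that loop under assumed toy components; every name in it (expert_label, rollout, progressive_goal, the nearest-neighbour "policy") is a hypothetical placeholder, not the authors' implementation.

```python
# Minimal, illustrative sketch of a DAgger-style data-aggregation loop with
# progressive goal cueing, assuming toy stand-ins for the simulator, the
# demonstration-derived expert, and the policy. All names are hypothetical.

def expert_label(state, goal):
    # Stand-in for an action recovered from the human demonstration:
    # simply step straight toward the cued goal.
    return [g - s for s, g in zip(state, goal)]

def rollout(policy, start, goal, steps=20):
    # Roll the *current* policy in a toy scene and record the states it
    # actually visits -- the states where distribution shift shows up.
    state, visited = list(start), []
    for _ in range(steps):
        visited.append(list(state))
        action = policy(state, goal)
        state = [s + 0.5 * a for s, a in zip(state, action)]
    return visited

def progressive_goal(iteration, total_iters, start, final_goal):
    # Progressive goal cueing: early iterations cue goals near the start
    # of the demonstration, later iterations move toward the final goal.
    alpha = (iteration + 1) / total_iters
    return [s + alpha * (g - s) for s, g in zip(start, final_goal)]

def fit_policy(dataset):
    # Placeholder "policy": nearest-neighbour lookup over aggregated data.
    def policy(state, goal):
        dist = lambda item: sum((a - b) ** 2 for a, b in zip(item[0], state))
        _, action = min(dataset, key=dist)
        return action
    return policy

start, final_goal = [0.0, 0.0], [4.0, 2.0]
dataset = [(list(start), expert_label(start, final_goal))]
policy = fit_policy(dataset)

for it in range(5):
    goal = progressive_goal(it, 5, start, final_goal)
    # DAgger step: label the learner's own visited states with expert
    # actions and aggregate them into the training set before refitting.
    for state in rollout(policy, start, goal):
        dataset.append((state, expert_label(state, goal)))
    policy = fit_policy(dataset)

print(f"aggregated {len(dataset)} labelled states")
```

The point the sketch tries to make is that expert labels are collected on states the learner actually visits, which is what mitigates the distribution shift mentioned in the abstract, while the progressive schedule simply interpolates the cued goal from near the demonstrated start toward the final target.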
Similar Papers
Learning to Generate Pointing Gestures in Situated Embodied Conversational Agents
Robotics
Robots learn to point and talk naturally.
UrbanNav: Learning Language-Guided Urban Navigation from Web-Scale Human Trajectories
Robotics
Helps robots follow spoken directions in cities.
Assessing Human Cooperation for Enhancing Social Robot Navigation
Robotics
Studies how human cooperation can improve social robot navigation.