Imitation Learning with Precisely Labeled Human Demonstrations
By: Yilong Song
Potential Business Impact:
Teaches robots to learn from human actions.
Within the imitation learning paradigm, training generalist robots requires large-scale datasets obtainable only through diverse curation. Because they are relatively easy to collect, human demonstrations are a valuable addition when incorporated appropriately. However, existing methods that use human demonstrations struggle to infer precise actions, to bridge the human-robot embodiment gap, and to integrate with frontier generalist robot training pipelines. In this work, building on prior studies that demonstrate the viability of hand-held grippers for efficient data collection, we leverage the user's control over the gripper's appearance (specifically, assigning it a unique, easily segmentable color) to enable simple and reliable application of RANSAC and ICP registration for precise end-effector pose estimation. We show in simulation that precisely labeled human demonstrations on their own allow policies to reach, on average, 88.1% of the performance of policies trained on robot demonstrations, and that they boost policy performance when combined with robot demonstrations, despite the inherent embodiment gap.
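To make the pose-labeling step concrete, the sketch below is one plausible way to implement it with OpenCV and Open3D (it is an illustration under stated assumptions, not the paper's released code): segment the uniquely colored gripper in an RGB-D frame, back-project the masked depth pixels into a point cloud, then register a reference gripper point cloud to the observation with RANSAC global registration followed by ICP refinement. The HSV color bounds, voxel size, and helper names are illustrative assumptions.

```python
"""Sketch: color-segment a hand-held gripper and estimate its pose via RANSAC + ICP."""
import cv2
import numpy as np
import open3d as o3d


def segment_gripper_mask(bgr_image: np.ndarray) -> np.ndarray:
    """Binary mask of the gripper, assuming it was painted a saturated green (hypothetical bounds)."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    lower, upper = np.array([45, 80, 60]), np.array([75, 255, 255])
    return cv2.inRange(hsv, lower, upper) > 0


def backproject(depth_m: np.ndarray, mask: np.ndarray, fx, fy, cx, cy) -> o3d.geometry.PointCloud:
    """Lift masked depth pixels (meters) into a 3D point cloud in the camera frame."""
    v, u = np.nonzero(mask & (depth_m > 0))
    z = depth_m[v, u]
    pts = np.stack([(u - cx) * z / fx, (v - cy) * z / fy, z], axis=1)
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(pts)
    return pcd


def estimate_gripper_pose(observed: o3d.geometry.PointCloud,
                          model: o3d.geometry.PointCloud,
                          voxel: float = 0.005) -> np.ndarray:
    """Return a 4x4 transform mapping the reference gripper model into the camera frame."""
    def preprocess(pcd):
        down = pcd.voxel_down_sample(voxel)
        down.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=2 * voxel, max_nn=30))
        fpfh = o3d.pipelines.registration.compute_fpfh_feature(
            down, o3d.geometry.KDTreeSearchParamHybrid(radius=5 * voxel, max_nn=100))
        return down, fpfh

    src, src_fpfh = preprocess(model)      # source: reference gripper cloud
    tgt, tgt_fpfh = preprocess(observed)   # target: color-segmented observation

    # Global alignment: RANSAC over FPFH feature correspondences.
    ransac = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
        src, tgt, src_fpfh, tgt_fpfh, True, 3 * voxel,
        o3d.pipelines.registration.TransformationEstimationPointToPoint(False), 3,
        [o3d.pipelines.registration.CorrespondenceCheckerBasedOnDistance(3 * voxel)],
        o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))

    # Local refinement: point-to-plane ICP initialized from the RANSAC result.
    icp = o3d.pipelines.registration.registration_icp(
        src, tgt, voxel, ransac.transformation,
        o3d.pipelines.registration.TransformationEstimationPointToPlane())
    return icp.transformation
```

A uniquely colored gripper is what makes the first step trivial: the mask isolates only gripper points, so registration runs against a clean partial view rather than the full cluttered scene, which is what keeps RANSAC and ICP reliable here.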
Similar Papers
Phantom: Training Robots Without Robots Using Only Human Videos
Robotics
Robots learn tasks from watching human videos.
Generalist Robot Manipulation beyond Action Labeled Data
Robotics
Robots learn new tasks from watching videos.
Crossing the Human-Robot Embodiment Gap with Sim-to-Real RL using One Human Demonstration
Robotics
Robots learn to do tasks from watching videos.