R2RGEN: Real-to-Real 3D Data Generation for Spatially Generalized Manipulation
By: Xiuwei Xu, Angyuan Ma, Hankun Li, and more
Potential Business Impact:
Robots learn to grab things in new places.
Towards the aim of generalized robotic manipulation, spatial generalization is the most fundamental capability: the policy must work robustly under different spatial distributions of objects, the environment, and the agent itself. Achieving this typically requires collecting substantial human demonstrations that cover diverse spatial configurations for training a generalized visuomotor policy via imitation learning. Prior works explore a promising direction that leverages data generation to acquire abundant spatially diverse data from minimal source demonstrations. However, most approaches face a significant sim-to-real gap and are often limited to constrained settings, such as fixed-base scenarios and predefined camera viewpoints. In this paper, we propose a real-to-real 3D data generation framework (R2RGen) that directly augments pointcloud observation-action pairs to generate real-world data. R2RGen is simulator- and rendering-free, and is therefore efficient and plug-and-play. Specifically, given a single source demonstration, we introduce an annotation mechanism for fine-grained parsing of the scene and trajectory. A group-wise augmentation strategy handles complex multi-object compositions and diverse task constraints. We further present camera-aware processing to align the distribution of generated data with that of a real-world 3D sensor. Empirically, R2RGen substantially enhances data efficiency across extensive experiments and demonstrates strong potential for scaling and for application to mobile manipulation.
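To make the core idea concrete, below is a minimal sketch of rigid-transform augmentation of a pointcloud observation-action pair: a sampled planar transform is applied jointly to an object's points and to the end-effector poses that act on it, keeping the pair geometrically consistent. All names, parameter ranges, and data shapes here are hypothetical illustrations; the paper's actual group-wise strategy, task-constraint handling, and camera-aware processing are considerably richer than this toy example.

```python
import numpy as np

def random_se2_transform(max_xy=0.1, max_yaw=np.pi / 6):
    """Sample a random planar rigid transform (rotation about z + xy translation).

    Hypothetical sampling ranges; R2RGen's group-wise augmentation and
    task constraints are not modeled here.
    """
    yaw = np.random.uniform(-max_yaw, max_yaw)
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    T[:2, 3] = np.random.uniform(-max_xy, max_xy, size=2)
    return T

def transform_points(T, points):
    """Apply a 4x4 rigid transform to an (N, 3) point cloud."""
    homo = np.concatenate([points, np.ones((points.shape[0], 1))], axis=1)
    return (homo @ T.T)[:, :3]

def augment_demo(points, ee_poses):
    """Jointly transform an object's points and the gripper poses acting on it,
    so the generated observation-action pair stays geometrically consistent."""
    T = random_se2_transform()
    new_points = transform_points(T, points)
    new_poses = np.einsum("ij,njk->nik", T, ee_poses)  # left-multiply each 4x4 pose
    return new_points, new_poses

if __name__ == "__main__":
    cloud = np.random.rand(1024, 3)         # stand-in for one segmented object's points
    poses = np.tile(np.eye(4), (50, 1, 1))  # stand-in for a 50-step end-effector trajectory
    aug_cloud, aug_poses = augment_demo(cloud, poses)
    print(aug_cloud.shape, aug_poses.shape)  # (1024, 3) (50, 4, 4)
```

In this framing, scaling a single source demonstration amounts to repeating the sampling step with different transforms per object group; the paper's camera-aware processing would then filter or reshape the transformed points to match what a real 3D sensor could actually observe from its viewpoint.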
Similar Papers
Real2Render2Real: Scaling Robot Data Without Dynamics Simulation or Robot Hardware
Robotics
Teaches robots using phone scans and videos.
IGen: Scalable Data Generation for Robot Learning from Open-World Images
Robotics
Teaches robots to do tasks using everyday pictures.
Gen2Real: Towards Demo-Free Dexterous Manipulation by Harnessing Generated Video
Robotics
Robots learn to grab things from watching videos.