Iterative Compositional Data Generation for Robot Control
By: Anh-Quan Pham, Marcel Hussing, Shubhankar P. Patankar, et al.
Potential Business Impact:
Robots learn new tasks by combining old skills.
Collecting robotic manipulation data is expensive, making it impractical to acquire demonstrations for the combinatorially large space of tasks that arise in multi-object, multi-robot, and multi-environment settings. While recent generative models can synthesize useful data for individual tasks, they do not exploit the compositional structure of robotic domains and struggle to generalize to unseen task combinations. We propose a semantic compositional diffusion transformer that factorizes transitions into robot-, object-, obstacle-, and objective-specific components and learns their interactions through attention. Once trained on a limited subset of tasks, we show that our model can zero-shot generate high-quality transitions from which we can learn control policies for unseen task combinations. Then, we introduce an iterative self-improvement procedure in which synthetic data is validated via offline reinforcement learning and incorporated into subsequent training rounds. Our approach substantially improves zero-shot performance over monolithic and hard-coded compositional baselines, ultimately solving nearly all held-out tasks and demonstrating the emergence of meaningful compositional structure in the learned representations.
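The abstract does not spell out the architecture, but the factorization it describes can be pictured as giving each task factor (robot, object, obstacle, objective) its own token and letting the factors interact through attention. Below is a minimal, hypothetical PyTorch sketch of that idea; the class name, factor dimensions, and layer sizes are illustrative assumptions, not the paper's actual model.

import torch
import torch.nn as nn

class CompositionalDenoiser(nn.Module):
    """Sketch: per-factor encoders, attention-based interaction, per-factor decoders."""
    def __init__(self, factor_dims, hidden=128, heads=4, layers=2):
        super().__init__()
        # One encoder per factor maps its (noisy) component into a shared token space.
        self.encoders = nn.ModuleList([nn.Linear(d, hidden) for d in factor_dims])
        block = nn.TransformerEncoderLayer(d_model=hidden, nhead=heads, batch_first=True)
        self.interaction = nn.TransformerEncoder(block, num_layers=layers)
        # One decoder per factor predicts its denoised component from the mixed token.
        self.decoders = nn.ModuleList([nn.Linear(hidden, d) for d in factor_dims])
        self.time_embed = nn.Linear(1, hidden)  # embedding of the diffusion timestep

    def forward(self, noisy_factors, t):
        # noisy_factors: list of (batch, dim_i) tensors, one per factor; t: (batch, 1).
        tokens = torch.stack(
            [enc(x) for enc, x in zip(self.encoders, noisy_factors)], dim=1
        )                                                  # (batch, n_factors, hidden)
        tokens = tokens + self.time_embed(t)[:, None, :]   # broadcast timestep embedding
        mixed = self.interaction(tokens)                   # factors interact via attention
        return [dec(mixed[:, i]) for i, dec in enumerate(self.decoders)]

# Toy usage: four factors (robot, object, obstacle, objective) of illustrative sizes.
model = CompositionalDenoiser([7, 3, 3, 6])
noisy = [torch.randn(8, d) for d in (7, 3, 3, 6)]
t = torch.rand(8, 1)
denoised = model(noisy, t)   # list of per-factor denoised components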
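The iterative self-improvement procedure could be organized roughly as follows: sample synthetic transitions for held-out task combinations, train a policy on them with offline RL, and fold the data back into training only if the policy succeeds. Every name in this sketch (fit, sample, train_offline_rl, evaluate_policy, the success threshold) is a placeholder assumption, not the paper's interface.

def self_improvement(generator, real_data, held_out_tasks,
                     train_offline_rl, evaluate_policy,
                     rounds=3, success_threshold=0.8):
    """Hypothetical loop: generate -> validate via offline RL -> absorb validated data."""
    dataset = list(real_data)
    for _ in range(rounds):
        generator.fit(dataset)                          # retrain the compositional diffusion model
        for task in held_out_tasks:
            synthetic = generator.sample(task)          # zero-shot transitions for an unseen combo
            policy = train_offline_rl(synthetic)        # e.g. an IQL/CQL-style offline learner
            if evaluate_policy(policy, task) >= success_threshold:
                dataset.extend(synthetic)               # keep only validated synthetic data
    return generator, dataset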
Similar Papers
A Compositional Paradigm for Foundation Models: Towards Smarter Robotic Agents
Robotics
AI learns new things without forgetting old ones.
Compose by Focus: Scene Graph-based Atomic Skills
Robotics
Robots learn to combine simple actions for new tasks.
Composition-Incremental Learning for Compositional Generalization
CV and Pattern Recognition
Teaches computers to learn new things over time.