Learning Diffusion Policy from Primitive Skills for Robot Manipulation
By: Zhihao Gu, Ming Yang, Difan Zou, and more
Potential Business Impact:
Teaches robots to do tasks by breaking them down.
Diffusion policies (DP) have recently shown great promise for generating actions in robotic manipulation. However, existing approaches often rely on global instructions to produce short-term control signals, which can result in misalignment in action generation. We conjecture that primitive skills, i.e., fine-grained, short-horizon manipulations such as "move up" and "open the gripper", provide a more intuitive and effective interface for robot learning. To bridge this gap, we propose SDP, a skill-conditioned DP that integrates interpretable skill learning with conditional action planning. SDP abstracts eight reusable primitive skills across tasks and employs a vision-language model to extract discrete representations from visual observations and language instructions. Based on these representations, a lightweight router network assigns a desired primitive skill to each state, which in turn selects a single-skill policy that generates skill-aligned actions. By decomposing complex tasks into a sequence of primitive skills and selecting a single-skill policy at each step, SDP ensures skill-consistent behavior across diverse tasks. Extensive experiments on two challenging simulation benchmarks and real-world robot deployments demonstrate that SDP consistently outperforms SOTA methods, providing a new paradigm for skill-based robot learning with diffusion policies.
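The routing step described in the abstract can be sketched as follows. This is a toy illustration, not the authors' implementation: the skill names beyond the two quoted above, the prototype-matching rule, and every function name here are assumptions for exposition.

```python
# Illustrative sketch of SDP-style skill routing (all names and the
# scoring rule are assumptions for exposition, not the paper's code).
# A state embedding is scored against eight skill prototypes; the argmax
# selects the single-skill policy that generates the next actions.

PRIMITIVE_SKILLS = [
    "move up", "move down", "move left", "move right",
    "move forward", "move backward", "open the gripper", "close the gripper",
]

def route_skill(state_embedding, skill_prototypes):
    """Return the index of the skill prototype most aligned with the state."""
    scores = [
        sum(s * p for s, p in zip(state_embedding, proto))
        for proto in skill_prototypes
    ]
    return max(range(len(scores)), key=scores.__getitem__)

def act(state_embedding, skill_prototypes, skill_policies):
    """Dispatch to the routed single-skill policy for a skill-aligned action."""
    idx = route_skill(state_embedding, skill_prototypes)
    return PRIMITIVE_SKILLS[idx], skill_policies[idx](state_embedding)

# Toy demo: eight one-hot prototypes in an 8-d embedding space; each
# "policy" just returns a placeholder action labelled with its skill.
prototypes = [[1.0 if i == j else 0.0 for j in range(8)] for i in range(8)]
policies = [lambda s, k=i: f"action-chunk-for-skill-{k}" for i in range(8)]

state = [0.0] * 8
state[6] = 1.0  # state most aligned with the "open the gripper" prototype
skill, action = act(state, prototypes, policies)
```

In the paper's setting the prototypes would come from the vision-language model's discrete representations and each per-skill policy would be a diffusion model; here both are stubbed out so the dispatch logic stays self-contained.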
Similar Papers
ADPro: a Test-time Adaptive Diffusion Policy for Robot Manipulation via Manifold and Initial Noise Constraints
Robotics
Robots learn to do tasks faster and better.
Diffusion Trajectory-guided Policy for Long-horizon Robot Manipulation
Robotics
Teaches robots to do long tasks better.
On-Device Diffusion Transformer Policy for Efficient Robot Manipulation
Robotics
Makes robots do tasks on small computers.