Hybrid Consistency Policy: Decoupling Multi-Modal Diversity and Real-Time Efficiency in Robotic Manipulation
By: Qianyou Zhao, Yuliang Shen, Xuanran Zhai and more
Potential Business Impact:
Robots learn to move faster, like humans.
In visuomotor policy learning, diffusion-based imitation learning has become widely adopted for its ability to capture diverse behaviors. However, approaches built on ordinary differential equation (ODE) and stochastic differential equation (SDE) denoising processes struggle to jointly achieve fast sampling and strong multi-modality. To address this, we propose the Hybrid Consistency Policy (HCP). HCP runs a short stochastic prefix up to an adaptive switch time, then applies a one-step consistency jump to produce the final action. To align this one-jump generation, HCP performs time-varying consistency distillation that combines a trajectory-consistency objective, which keeps neighboring predictions coherent, with a denoising-matching objective, which improves local fidelity. In both simulation and on a real robot, HCP with 25 SDE steps plus one jump approaches the 80-step DDPM teacher in accuracy and mode coverage while significantly reducing latency. These results show that multi-modality does not require slow inference: an adaptive switch time decouples mode retention from speed, yielding a practical accuracy-efficiency trade-off for robot policies.
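The two-phase sampler described above (a short stochastic prefix, then a single consistency jump) can be sketched on a Gaussian toy problem where the score function and the consistency map are known in closed form. Everything here is an illustrative assumption, not the paper's implementation: `score_fn`, `consistency_fn`, the variance-exploding schedule sigma(t) = t, and the hypothetical `hcp_sample` helper all stand in for the learned networks and schedule a real HCP policy would use.

```python
import numpy as np

def score_fn(x, t):
    # Exact score of the noised marginal N(0, 1 + t^2) when the data
    # distribution is standard normal under a VE schedule sigma(t) = t.
    return -x / (1.0 + t * t)

def consistency_fn(x, t):
    # Closed-form consistency map for this Gaussian toy: the probability-flow
    # ODE carries x_t to x_0 = x_t / sqrt(1 + t^2), i.e. a one-step jump to t = 0.
    return x / np.sqrt(1.0 + t * t)

def hcp_sample(n, t_start=5.0, t_switch=0.5, prefix_steps=25, seed=0):
    """Stochastic SDE prefix from t_start down to t_switch, then one jump."""
    rng = np.random.default_rng(seed)
    # Start from the noisy marginal at t_start.
    x = rng.standard_normal(n) * np.sqrt(1.0 + t_start**2)
    ts = np.linspace(t_start, t_switch, prefix_steps + 1)
    for i in range(prefix_steps):
        t, dt = ts[i], ts[i] - ts[i + 1]
        g2 = 2.0 * t  # g(t)^2 for the VE schedule sigma(t) = t
        # Euler-Maruyama step of the reverse-time SDE; the injected noise is
        # what preserves multi-modality during the prefix.
        x = x + g2 * score_fn(x, t) * dt + np.sqrt(g2 * dt) * rng.standard_normal(n)
    # One consistency jump replaces the many remaining fine-grained steps.
    return consistency_fn(x, t_switch)

samples = hcp_sample(4000)
```

On this toy, the prefix tracks the true noisy marginals while the jump lands the samples near the standard-normal data distribution, mirroring the paper's "25 SDE steps plus one jump" recipe at a fraction of the cost of a long denoising chain.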
Similar Papers
Imitation Learning Policy based on Multi-Step Consistent Integration Shortcut Model
Robotics
Teaches robots to copy actions much faster.
Hybrid-Diffusion Models: Combining Open-loop Routines with Visuomotor Diffusion Policies
Robotics
Robots learn to do tricky jobs faster and better.
Improving Robustness to Out-of-Distribution States in Imitation Learning via Deep Koopman-Boosted Diffusion Policy
Robotics
Robots learn to do tasks better by watching and feeling.