Adaptive Diffusion Policy Optimization for Robotic Manipulation
By: Huiyun Jiang, Zhuang Yang
Potential Business Impact:
Teaches robots to learn tasks faster and better.
Recent studies have shown the great potential of diffusion models in improving reinforcement learning (RL) by modeling complex policies, expressing a high degree of multi-modality, and efficiently handling high-dimensional continuous control tasks. However, there is currently limited research on how to optimize diffusion-based policies (e.g., Diffusion Policy) quickly and stably. In this paper, we propose Adam-based Diffusion Policy Optimization (ADPO), a fast algorithmic framework containing best practices for fine-tuning diffusion-based policies in robotic control tasks using the adaptive gradient descent method in RL. Adaptive gradient methods are relatively understudied in RL training, let alone for diffusion-based policies. We confirm that ADPO outperforms other diffusion-based RL methods in overall effectiveness when fine-tuning on standard robotic tasks. Concretely, we conduct extensive experiments on standard robotic control tasks to test ADPO, with six popular diffusion-based RL methods serving as baselines. Experimental results show that ADPO achieves performance better than or comparable to the baseline methods. Finally, we systematically analyze the sensitivity of multiple hyperparameters on standard robotics tasks, providing guidance for subsequent practical applications. Our video demonstrations are released at https://github.com/Timeless-lab/ADPO.git.
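To make the core idea concrete, here is a minimal sketch (not the authors' released code) of fine-tuning a diffusion policy's denoising network with an adaptive gradient optimizer (torch.optim.Adam). The toy denoising MLP, the advantage-weighted denoising loss, and all hyperparameters are illustrative assumptions, not ADPO's actual implementation.

```python
import torch
import torch.nn as nn

class DenoisingMLP(nn.Module):
    """Predicts the noise added to an action, conditioned on state and diffusion timestep."""
    def __init__(self, state_dim=4, action_dim=2, hidden=128, n_timesteps=16):
        super().__init__()
        self.embed_t = nn.Embedding(n_timesteps, hidden)
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + hidden, hidden),
            nn.Mish(),
            nn.Linear(hidden, hidden),
            nn.Mish(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, state, noisy_action, t):
        return self.net(torch.cat([state, noisy_action, self.embed_t(t)], dim=-1))

policy = DenoisingMLP()
# Adaptive gradient descent (Adam) on the policy parameters -- the premise behind ADPO.
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4, betas=(0.9, 0.999))

n_timesteps = 16
for step in range(1000):
    # Stand-in batch; in a real RL fine-tuning loop these would come from rollouts.
    states = torch.randn(64, 4)
    target_actions = torch.randn(64, 2)
    weights = torch.rand(64, 1)  # e.g., advantages from a critic (assumption)

    # DDPM-style denoising objective, weighted by the RL signal.
    t = torch.randint(0, n_timesteps, (64,))
    noise = torch.randn_like(target_actions)
    alpha_bar = torch.cos(t.float() / n_timesteps * torch.pi / 2).unsqueeze(-1) ** 2
    noisy_actions = alpha_bar.sqrt() * target_actions + (1 - alpha_bar).sqrt() * noise

    pred_noise = policy(states, noisy_actions, t)
    loss = (weights * (pred_noise - noise) ** 2).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The key point the sketch illustrates is that the policy update is driven by Adam's per-parameter adaptive step sizes rather than plain SGD, which is what the paper studies for stable and fast fine-tuning of diffusion policies.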
Similar Papers
ADPro: a Test-time Adaptive Diffusion Policy for Robot Manipulation via Manifold and Initial Noise Constraints
Robotics
Robots learn to do tasks faster and better.
RA-DP: Rapid Adaptive Diffusion Policy for Training-Free High-frequency Robotics Replanning
Robotics
Robots quickly learn new tasks in changing places.
Exploration and Adaptation in Non-Stationary Tasks with Diffusion Policies
Artificial Intelligence
Teaches robots to learn new tasks quickly.