
DiWA: Diffusion Policy Adaptation with World Models

Published: August 5, 2025 | arXiv ID: 2508.03645v1

By: Akshay L Chandra, Iman Nematollahi, Chenguang Huang and more

Potential Business Impact:

Teaches robots new tricks without real-world practice.

Fine-tuning diffusion policies with reinforcement learning (RL) presents significant challenges. The long denoising sequence for each action prediction impedes effective reward propagation. Moreover, standard RL methods require millions of real-world interactions, posing a major bottleneck for practical fine-tuning. Although prior work frames the denoising process in diffusion policies as a Markov Decision Process to enable RL-based updates, its strong dependence on environment interaction remains highly inefficient. To bridge this gap, we introduce DiWA, a novel framework that leverages a world model for fine-tuning diffusion-based robotic skills entirely offline with reinforcement learning. Unlike model-free approaches that require millions of environment interactions to fine-tune a repertoire of robot skills, DiWA achieves effective adaptation using a world model trained once on a few hundred thousand offline play interactions. This results in dramatically improved sample efficiency, making the approach significantly more practical and safer for real-world robot learning. On the challenging CALVIN benchmark, DiWA improves performance across eight tasks using only offline adaptation, while requiring orders of magnitude fewer physical interactions than model-free baselines. To our knowledge, this is the first demonstration of fine-tuning diffusion policies for real-world robotic skills using an offline world model. We make the code publicly available at https://diwa.cs.uni-freiburg.de.
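As a rough illustration of the idea in the abstract (improving a policy from rollouts imagined inside a learned model instead of from new robot interactions), the toy sketch below scores a policy by its discounted return under a stand-in world model. It is not the authors' implementation. The names `imagined_return`, `toy_world_model`, `toy_reward`, `toy_policy`, and `idle_policy` are hypothetical, the state is a single float rather than a learned latent, and no diffusion policy or denoising step is modeled.

```python
from typing import Callable


def imagined_return(
    policy: Callable[[float], float],
    world_model: Callable[[float, float], float],
    reward_fn: Callable[[float], float],
    state: float,
    horizon: int,
    gamma: float = 0.99,
) -> float:
    """Discounted return of `policy` rolled out inside `world_model`.

    No real environment is queried: every transition comes from the model.
    """
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        action = policy(state)
        state = world_model(state, action)      # imagined next state
        total += discount * reward_fn(state)    # reward predicted on that state
        discount *= gamma
    return total


# Hypothetical stand-ins for a learned dynamics model and reward predictor.
def toy_world_model(state: float, action: float) -> float:
    return state + action


def toy_reward(state: float) -> float:
    return -state * state  # highest (zero) when the state reaches the origin


def toy_policy(state: float) -> float:
    return -0.5 * state  # moves halfway toward the origin each step


def idle_policy(state: float) -> float:
    return 0.0  # never moves


if __name__ == "__main__":
    for name, pi in [("toy_policy", toy_policy), ("idle_policy", idle_policy)]:
        value = imagined_return(pi, toy_world_model, toy_reward, 1.0, horizon=5)
        print(name, value)
```

In this sketch the imagined returns are only compared between two fixed policies. An offline fine-tuning loop would instead use such model-generated returns as the training signal for the policy update.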

World4RL: Diffusion World Models for Policy Refinement with Reinforcement Learning for Robotic Manipulation

Robotics

Teaches robots new skills without real-world practice.

23 Sep 2025

90%

DAWM: Diffusion Action World Models for Offline Reinforcement Learning via Action-Inferred Transitions

Machine Learning (CS)

Teaches robots to learn from past experiences.

23 Sep 2025

89%

LaDi-WM: A Latent Diffusion-based World Model for Predictive Manipulation

Robotics

Helps robots learn to do tasks better.

13 May 2025


Page Count

23 pages


