Solving Bayesian inverse problems with diffusion priors and off-policy RL
By: Luca Scimeca, Siddarth Venkatraman, Moksh Jain, and more
Potential Business Impact:
Makes AI better at solving hard inverse problems in science and imaging.
This paper presents a practical application of Relative Trajectory Balance (RTB), a recently introduced off-policy reinforcement learning (RL) objective that can asymptotically solve Bayesian inverse problems optimally. We extend the original work by using RTB to train conditional diffusion model posteriors from pretrained unconditional priors for challenging linear and non-linear inverse problems in vision and science. We use the objective alongside techniques such as off-policy backtracking exploration to improve training. Importantly, our results show that existing training-free diffusion posterior methods struggle to perform effective posterior inference in latent space due to inherent biases.
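For readers unfamiliar with the objective, the sketch below shows the basic shape of an RTB-style loss: a squared log-ratio between the trainable posterior sampler's trajectory probability (times a learned normalizer) and the frozen diffusion prior's trajectory probability weighted by the measurement likelihood r(x_0). This is a minimal, illustrative assumption-laden sketch, not the paper's implementation: the function name rtb_loss, the tensor shapes, and the toy inputs are hypothetical, and in practice the per-step log-probabilities come from the Gaussian denoising transitions of the posterior and prior samplers.

```python
import torch

def rtb_loss(log_p_post_steps, log_p_prior_steps, log_r_x0, log_Z):
    """Relative Trajectory Balance (RTB) loss for a batch of denoising trajectories.

    Args (shapes are illustrative assumptions):
        log_p_post_steps:  (B, T) per-step log-probs log q_theta(x_{t-1} | x_t)
                           under the trainable conditional posterior sampler.
        log_p_prior_steps: (B, T) per-step log-probs under the frozen,
                           pretrained unconditional diffusion prior.
        log_r_x0:          (B,) log-likelihood of the measurement, log p(y | x_0).
        log_Z:             scalar learnable estimate of the log normalizing constant.

    Minimizing the squared log-ratio drives q_theta(x_0) toward
    p_prior(x_0) * p(y | x_0) / Z, i.e. the Bayesian posterior. Because the
    loss is defined per trajectory, the trajectories can come from any
    off-policy source (e.g. backtracking exploration or a replay buffer).
    """
    delta = (
        log_Z
        + log_p_post_steps.sum(dim=1)   # log q_theta(tau)
        - log_p_prior_steps.sum(dim=1)  # log p_prior(tau)
        - log_r_x0                      # log r(x_0) = log p(y | x_0)
    )
    return (delta ** 2).mean()


# Toy usage with random tensors standing in for real trajectory log-probs.
B, T = 4, 50
log_Z = torch.zeros((), requires_grad=True)
loss = rtb_loss(
    torch.randn(B, T, requires_grad=True),  # posterior per-step log-probs
    torch.randn(B, T),                      # prior per-step log-probs (frozen)
    torch.randn(B),                         # measurement log-likelihoods
    log_Z,
)
loss.backward()
```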
Similar Papers
Relative Trajectory Balance is equivalent to Trust-PCL
Machine Learning (CS)
Makes AI models create better, more focused results.
BiTrajDiff: Bidirectional Trajectory Generation with Diffusion Models for Offline Reinforcement Learning
Machine Learning (CS)
Helps robots learn better from past experiences.
Real-Time Iteration Scheme for Diffusion Policy
Robotics
Makes robots move faster without retraining.