Conditioning Diffusions Using Malliavin Calculus
By: Jakiw Pidstrigach, Elizabeth Baker, Carles Domingo-Enrich, and more
Potential Business Impact:
Lets AI learn from rewards that can't be differentiated, like hitting an exact target.
In generative modelling and stochastic optimal control, a central computational task is to modify a reference diffusion process to maximise a given terminal-time reward. Most existing methods require this reward to be differentiable, using gradients to steer the diffusion towards favourable outcomes. However, in many practical settings, such as diffusion bridges, the reward is singular, taking an infinite value if the target is hit and zero otherwise. We introduce a novel framework, based on Malliavin calculus and centred around a generalisation of the Tweedie score formula to nonlinear stochastic differential equations, that enables the development of methods robust to such singularities. This allows our approach to handle a broad range of applications, such as diffusion bridges or adding conditional controls to an already trained diffusion model. We demonstrate that our approach offers stable and reliable training, outperforming existing techniques. As a byproduct, we also introduce a novel score matching objective. Our loss functions are formulated so that they can readily be extended to manifold-valued and infinite-dimensional diffusions.
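For context, the classical Tweedie formula that the paper generalises applies to a Gaussian perturbation $X_t = X_0 + \sigma_t \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, I)$, recovering the posterior mean of $X_0$ from the marginal score; the standard Doob h-transform then shows where such a score term enters when a diffusion is steered by a terminal reward. The notation below ($b$, $\sigma$, $r$, $h_t$, $p_t$) is ours for illustration and is not taken from the paper, which treats the nonlinear SDE case.

\[
\mathbb{E}[X_0 \mid X_t = x] = x + \sigma_t^2 \,\nabla_x \log p_t(x)
\]

\[
\mathrm{d}X_t = \bigl[\, b(X_t) + \sigma\sigma^{\top}\,\nabla_x \log h_t(X_t) \,\bigr]\,\mathrm{d}t + \sigma\,\mathrm{d}W_t,
\qquad
h_t(x) = \mathbb{E}\bigl[ r(X_T) \mid X_t = x \bigr].
\]

In the diffusion-bridge setting the reward is a Dirac mass at the target, so $h_t$ becomes a transition density and $\nabla_x \log h_t$ cannot be obtained by differentiating the reward itself, which is why gradient-based steering breaks down for the singular rewards described in the abstract.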
Similar Papers
Malliavin Calculus for Score-based Diffusion Models
Machine Learning (CS)
Makes AI create realistic images and sounds.
Evolvable Conditional Diffusion
Machine Learning (CS)
Helps computers discover new science without math.
Dynamics-aware Diffusion Models for Planning and Control
Robotics
Makes robots move safely in tricky places.