Latent Discrete Diffusion Models
By: Dario Shariatian, Alain Durmus, Stefano Peluchetti
Potential Business Impact:
Makes computers write better stories and sentences.
We study discrete diffusion for language and other categorical data and focus on a common limitation of masked denoisers: reverse transitions typically factorize across positions, which can weaken joint structure and degrade quality in few-step generation. We propose \emph{Latent Discrete Diffusion Models} (LDDMs), which couple a masked discrete diffusion over tokens with a continuous diffusion over latent embeddings. The latent channel provides a softer signal and carries cross-token dependencies that help resolve ambiguities. We present two instantiations: (i) FUJI-LDDMs, which perform fully joint denoising of tokens and latents, and (ii) SEQ-LDDMs, which resolve the latent chain first and then the discrete chain conditionally on it. For both variants we derive ELBO-style objectives and discuss design choices for learning latents that are informative yet amenable to diffusion modeling. In experiments, LDDMs improve unconditional generation metrics over state-of-the-art masked discrete diffusion baselines and remain effective at lower sampling budgets, where unmasking many tokens per step is desirable.
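To make the coupled sampling structure concrete, below is a minimal, hypothetical sketch of one reverse step in a SEQ-LDDM-style scheme: the continuous latent is updated first, and masked tokens are then unmasked conditioned on it. The module names (latent_net, token_net), the toy noise schedule, and the unmasking rule are illustrative assumptions based on standard masked (absorbing-state) discrete diffusion samplers, not the paper's implementation.

```python
# Hypothetical sketch of one reverse step in a SEQ-LDDM-style sampler.
# All names and update rules here are illustrative assumptions.
import torch

MASK_ID = 0  # hypothetical id of the [MASK] token

@torch.no_grad()
def reverse_step(tokens, z_t, t, s, latent_net, token_net):
    """One reverse step from time t to s < t (both in (0, 1])."""
    # 1) Continuous latent update: predict the clean latent and take a
    #    deterministic DDIM-like step under a toy interpolation schedule.
    z0_hat = latent_net(z_t, tokens, t)            # predicted clean latent
    alpha_t, alpha_s = 1.0 - t, 1.0 - s            # toy noise schedule
    eps_hat = (z_t - alpha_t * z0_hat) / (1.0 - alpha_t)
    z_s = alpha_s * z0_hat + (1.0 - alpha_s) * eps_hat

    # 2) Discrete token update: predict token logits conditioned on the
    #    updated latent, then unmask a fraction of masked positions.
    logits = token_net(tokens, z_s, t)             # (batch, length, vocab)
    sampled = torch.distributions.Categorical(logits=logits).sample()

    still_masked = tokens == MASK_ID
    # Unmask each masked position with probability (t - s) / t, as in
    # standard masked discrete diffusion reverse transitions.
    unmask = still_masked & (
        torch.rand_like(tokens, dtype=torch.float) < (t - s) / t
    )
    tokens_s = torch.where(unmask, sampled, tokens)
    return tokens_s, z_s
```

In a FUJI-LDDM-style variant, the two updates would instead be produced jointly by a single denoiser at each step; the sequential split shown here is only one of the two instantiations described in the abstract.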
Similar Papers
Coevolutionary Continuous Discrete Diffusion: Make Your Diffusion Language Model a Latent Reasoner
Artificial Intelligence
Makes AI understand words better by thinking continuously.
Simple Denoising Diffusion Language Models
Machine Learning (CS)
Makes computers write better stories and sentences.
Encoder-Decoder Diffusion Language Models for Efficient Training and Inference
Machine Learning (CS)
Makes AI write and understand faster.