Fast and Fluent Diffusion Language Models via Convolutional Decoding and Rejective Fine-tuning
By: Yeongbin Seo, Dongha Lee, Jaehyung Kim, and more
Potential Business Impact:
Makes AI write better and faster.
Autoregressive (AR) language models generate text one token at a time, which limits their inference speed. Diffusion-based language models offer a promising alternative, as they can decode multiple tokens in parallel. However, we identify a key bottleneck in current diffusion LMs: the long decoding-window problem, where tokens generated far from the input context often become irrelevant or repetitive. Previous solutions, such as semi-autoregressive decoding, address this issue by splitting the window into blocks, but this sacrifices speed and bidirectionality, eliminating the main advantage of diffusion models. To overcome this, we propose Convolutional decoding (Conv), a normalization-based method that narrows the decoding window without hard segmentation, leading to better fluency and flexibility. Additionally, we introduce Rejecting Rule-based Fine-Tuning (R2FT), a post-hoc training scheme that better aligns tokens at positions far from the context. Our methods achieve state-of-the-art results on open-ended generation benchmarks (e.g., AlpacaEval) among diffusion LM baselines, while using significantly fewer decoding steps than previous works, demonstrating improvements in both speed and quality.
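To make the idea of a soft, normalization-based decoding window more concrete, here is a minimal sketch of one parallel decoding step. It is not the paper's actual Conv algorithm: the function name conv_decode_step, the triangular kernel, and the top-k selection rule are all illustrative assumptions. The sketch only shows the general mechanism the abstract describes, down-weighting unmasking confidences at positions far from already-decoded context instead of imposing hard semi-autoregressive blocks.

```python
import numpy as np

def conv_decode_step(confidences, decoded_mask, kernel_width=8, tokens_per_step=4):
    """One parallel decoding step with a convolution-narrowed window (illustrative sketch).

    confidences  : (L,) model confidence for unmasking each position
    decoded_mask : (L,) 1.0 where a token is already decoded (incl. the prompt), else 0.0

    Positions near already-decoded tokens receive higher weight, so far-from-context
    positions are deferred to later steps -- a soft window rather than a hard block split.
    """
    # Triangular kernel: weight decays with distance from decoded tokens (assumed shape).
    half = np.arange(1, kernel_width + 1, dtype=np.float64)
    kernel = np.concatenate([half, half[::-1]]) / kernel_width

    # Proximity score: how much decoded context surrounds each position.
    proximity = np.convolve(decoded_mask, kernel, mode="same")

    # Re-normalize confidences by proximity; exclude already-decoded slots.
    scores = confidences * proximity
    scores[decoded_mask.astype(bool)] = -np.inf

    # Unmask the top-k remaining positions this step.
    return np.argsort(scores)[-tokens_per_step:]

# Toy usage: a 32-token window with an 8-token prompt already decoded.
L = 32
rng = np.random.default_rng(0)
conf = rng.random(L)
decoded = np.zeros(L)
decoded[:8] = 1.0
print(conv_decode_step(conf, decoded))
```

Because the window is only reweighted rather than segmented, every masked position can still attend bidirectionally and remains eligible in later steps; the kernel simply biases early steps toward context-adjacent tokens.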
Similar Papers
Blockwise SFT for Diffusion Language Models: Reconciling Bidirectional Attention and Autoregressive Decoding
Computation and Language
Teaches AI to write better by training it block by block.
Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States
Computation and Language
Makes AI write faster and smarter.
Diffusion LLMs Can Do Faster-Than-AR Inference via Discrete Diffusion Forcing
Machine Learning (CS)
Makes AI write much faster than before.