On the Reasoning Abilities of Masked Diffusion Language Models
By: Anej Svete, Ashish Sabharwal
Potential Business Impact:
Language models that generate text in parallel can solve certain reasoning problems in far fewer sequential steps than models that generate one token at a time, which can translate directly into faster inference.
Masked diffusion models (MDMs) for text offer a compelling alternative to traditional autoregressive language models. Parallel generation makes them efficient, but their computational capabilities and the limitations inherent to their parallelism remain largely unexplored. To address this gap, we characterize what types of reasoning problems MDMs can provably solve and how efficiently. We do this by connecting MDMs to the well-understood reasoning frameworks of chain of thought (CoT) and padded looped transformers (PLTs) in the finite-precision log-width setting: We show that MDMs and polynomially-padded PLTs are, in fact, equivalent in this setting, and that MDMs can solve all problems that CoT-augmented transformers can. Moreover, we showcase classes of problems (including regular languages) for which MDMs are inherently more efficient than CoT transformers, where parallel generation allows for substantially faster reasoning.
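To make the regular-language speedup concrete, here is a minimal sketch of the standard parallel-prefix argument (an illustration, not the paper's actual construction): recognizing a regular language reduces to composing the per-symbol transition functions of a DFA, and because composition is associative, a balanced tree of compositions finishes in O(log n) parallel rounds, whereas a token-by-token chain of thought walks the automaton through n sequential steps. The example DFA (parity of 1s) and all names below are hypothetical choices for this illustration.

# Sketch: why regular-language recognition parallelizes.
# A step-by-step CoT walks the DFA in n sequential steps; a balanced
# reduction over transition functions needs only O(log n) parallel rounds.

# Hypothetical DFA: parity of 1s over {0, 1}, states 0 and 1, start state 0.
DELTA = {0: {0: 0, 1: 1}, 1: {0: 1, 1: 0}}  # DELTA[state][symbol]

def step_fn(symbol):
    """Transition function for one input symbol, as a state -> state table."""
    return tuple(DELTA[q][symbol] for q in (0, 1))

def compose(f, g):
    """Apply f, then g. Composition is associative, so a tree reduction works."""
    return tuple(g[f[q]] for q in (0, 1))

def parity_parallel(bits):
    """Tree-shaped reduction: O(log n) rounds if each round runs in parallel."""
    fns = [step_fn(b) for b in bits]
    rounds = 0
    while len(fns) > 1:
        # One parallel round: compose adjacent pairs; odd leftover carries over.
        fns = [compose(fns[i], fns[i + 1]) if i + 1 < len(fns) else fns[i]
               for i in range(0, len(fns), 2)]
        rounds += 1
    final_state = fns[0][0]  # apply the composed function to start state 0
    return final_state, rounds

state, rounds = parity_parallel([1, 0, 1, 1, 0, 1, 0, 1])
print(f"final state: {state}, parallel rounds: {rounds}")  # 3 rounds for n = 8

For n = 8 input bits the sketch finishes in 3 parallel rounds rather than 8 sequential steps, mirroring how an MDM that fills many masked positions per step can beat a CoT transformer's one-token-per-step budget.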
Similar Papers
No Compute Left Behind: Rethinking Reasoning and Sampling with Masked Diffusion Models
Machine Learning (CS)
Rethinks how masked diffusion models spend compute during reasoning and sampling.
Improving Text Style Transfer using Masked Diffusion Language Models with Inference-time Scaling
Computation and Language
Uses masked diffusion language models with inference-time scaling to improve text style transfer.