T$^\star$: Progressive Block Scaling for MDM Through Trajectory Aware RL
By: Hanchen Xia, Baoyou Chen, Yutang Ge, and others
We present T$^\star$, a simple \textsc{TraceRL}-based training curriculum for progressive block-size scaling in masked diffusion language models (MDMs). Starting from an AR-initialized small-block MDM, T$^\star$~transitions smoothly to larger blocks, enabling higher-parallelism decoding with minimal performance degradation on math reasoning benchmarks. Further analysis suggests that T$^\star$~can converge to an alternative decoding schedule $\hat{\rm S}$ that achieves comparable performance.
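The curriculum described above can be sketched as a staged block-size schedule. This is a minimal illustration, assuming a doubling schedule over evenly spaced training stages; the function name, the start/max block sizes, and the doubling rule are illustrative assumptions, not the paper's actual schedule.

```python
import math

# Hypothetical sketch of a progressive block-size curriculum for an MDM.
# Assumption: block size doubles at evenly spaced stage boundaries, moving
# from small-block (near-AR) decoding toward larger, more parallel blocks.
def block_size_schedule(step, total_steps, start_block=4, max_block=32):
    """Return the decoding block size used at a given training step."""
    # Number of stages needed to double from start_block up to max_block,
    # e.g. 4 -> 8 -> 16 -> 32 gives 4 stages.
    n_stages = int(math.log2(max_block // start_block)) + 1
    stage_len = total_steps / n_stages
    stage = min(int(step // stage_len), n_stages - 1)
    return start_block * (2 ** stage)
```

With the defaults, training begins with blocks of 4 tokens and ends with blocks of 32, so parallelism is introduced gradually rather than all at once.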