Decentralized Autoregressive Generation

Published: January 6, 2026 | arXiv ID: 2601.03184v1

By: Stepan Maschan, Haoxuan Qu, Jun Liu

Potential Business Impact:

Makes AI models learn faster and better together.

Business Areas:
Machine Learning, Artificial Intelligence, Data and Analytics, Software

We present a theoretical analysis of decentralized autoregressive generation. We define the Decentralized Discrete Flow Matching objective by expressing the probability-generating velocity as a linear combination of expert flows. We also conduct experiments demonstrating the equivalence between decentralized and centralized training settings for multimodal language models across a diverse set of benchmarks. Specifically, we compare two distinct paradigms: LLaVA, which uses a fixed CLIP vision encoder, and InternVL 2.5-1B, which performs full-parameter fine-tuning (ViT+MLP+LLM) during the instruction tuning stage.
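The core construction above, expressing the generation velocity as a linear combination of expert flows, can be sketched numerically. This is an illustrative toy, not the paper's implementation: the function name `combine_expert_flows`, the array shapes, and the example values are all assumptions for demonstration.

```python
import numpy as np

def combine_expert_flows(expert_velocities, weights):
    """Linearly combine per-expert probability velocities.

    expert_velocities: shape (K, V) -- K experts, each providing a velocity
    over a vocabulary of V tokens (hypothetical parameterization).
    weights: shape (K,), non-negative mixture weights summing to 1.
    Returns the aggregated velocity of shape (V,).
    """
    weights = np.asarray(weights, dtype=float)
    expert_velocities = np.asarray(expert_velocities, dtype=float)
    assert np.isclose(weights.sum(), 1.0), "mixture weights must sum to 1"
    # A convex combination of valid probability velocities (each summing
    # to zero) again sums to zero, so probability mass is conserved.
    return weights @ expert_velocities

# Two hypothetical experts over a 3-token vocabulary; each row sums to 0.
v_experts = np.array([[ 0.2, -0.1, -0.1],
                      [-0.3,  0.2,  0.1]])
w = np.array([0.5, 0.5])
v_combined = combine_expert_flows(v_experts, w)
```

Because each expert velocity sums to zero over the vocabulary, the combined velocity does too, which is the property that lets decentralized experts jointly define a valid generative flow.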

Page Count
11 pages

Category
Computer Science:
Machine Learning (CS)