Towards Better & Faster Autoregressive Image Generation: From the Perspective of Entropy
By: Xiaoxiao Ma, Feng Zhao, Pengyang Ling and more
Potential Business Impact:
Makes AI draw better pictures, faster.
In this work, we first revisit the sampling issues in current autoregressive (AR) image generation models and identify that image tokens, unlike text tokens, exhibit lower information density and a non-uniform spatial distribution. Accordingly, we present an entropy-informed decoding strategy that improves autoregressive generation quality while speeding up synthesis. Specifically, the proposed method introduces two main innovations: 1) dynamic temperature control guided by the spatial entropy of token distributions, which improves the balance between content diversity, alignment accuracy, and structural coherence in both mask-based and scale-wise models without extra computational overhead, and 2) entropy-aware acceptance rules in speculative decoding, which achieve near-lossless generation at about 85% of the inference cost of conventional acceleration methods. Extensive experiments across multiple benchmarks using diverse AR image generation models demonstrate the effectiveness and generalizability of our approach in enhancing both generation quality and sampling speed.
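To make the idea of entropy-guided temperature concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not state how entropy maps to temperature, so the linear mapping, its direction (lower entropy gives lower temperature), the bounds `t_min` and `t_max`, and all function names are assumptions chosen for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_entropy(probs):
    """Shannon entropy divided by log(vocabulary size), so it lies in [0, 1]."""
    if len(probs) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))

def entropy_to_temperature(h_norm, t_min=0.5, t_max=1.0):
    # Hypothetical linear mapping (an assumption, not taken from the paper):
    # low-entropy positions get t_min, high-entropy positions get t_max.
    return t_min + (t_max - t_min) * h_norm

def adaptive_probs(logits, t_min=0.5, t_max=1.0):
    """Return (re-tempered probabilities, temperature) for one token position."""
    base = softmax(logits)              # distribution at temperature 1
    h = normalized_entropy(base)        # how uncertain the model is here
    t = entropy_to_temperature(h, t_min, t_max)
    return softmax(logits, t), t

if __name__ == "__main__":
    for name, logits in [("flat", [0.0, 0.0, 0.0, 0.0]),
                         ("peaked", [10.0, 0.0, 0.0, 0.0])]:
        probs, t = adaptive_probs(logits)
        print(name, round(t, 3), [round(p, 3) for p in probs])
```

In an image model this would run once per token position, so the temperature varies across the image according to the local entropy. Swapping `t_min` and `t_max` reverses the direction of the mapping; which direction helps in practice is something the paper itself, not this sketch, would settle.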
Similar Papers
DiverseAR: Boosting Diversity in Bitwise Autoregressive Image Generation
CV and Pattern Recognition
Makes AI create more varied and realistic pictures.
Understand Before You Generate: Self-Guided Training for Autoregressive Image Generation
CV and Pattern Recognition
Makes AI better at understanding and creating pictures.
Learning to Expand Images for Efficient Visual Autoregressive Modeling
CV and Pattern Recognition
Makes AI draw pictures faster and better.