Spherical Leech Quantization for Visual Tokenization and Generation
By: Yue Zhao, Hanwen Jiang, Zhenlin Xu, et al.
Potential Business Impact:
Makes pictures clearer when saved with less space.
Non-parametric quantization has received much attention due to its parameter efficiency and scalability to large codebooks. In this paper, we present a unified formulation of different non-parametric quantization methods through the lens of lattice coding. The geometry of lattice codes explains why auxiliary loss terms are necessary when training auto-encoders with certain existing lookup-free quantization variants such as BSQ. As a step forward, we explore several candidate lattices, including random lattices, generalized Fibonacci lattices, and densest sphere-packing lattices. Among these, we find that the Leech lattice-based quantization method, dubbed Spherical Leech Quantization ($Λ_{24}$-SQ), leads to both a simplified training recipe and an improved reconstruction-compression tradeoff thanks to its high symmetry and even distribution on the hypersphere. In image tokenization and compression tasks, this quantization approach achieves better reconstruction quality across all metrics than BSQ, the best prior art, while consuming slightly fewer bits. The improvement also extends to state-of-the-art auto-regressive image generation frameworks.
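To make the core idea concrete, here is a minimal sketch of lookup-free spherical quantization: project a vector onto the unit sphere and snap it to the nearest codeword (maximum inner product). The codebook below uses BSQ-style sign patterns scaled to the unit sphere; this is an illustrative toy in 4 dimensions, not the paper's $Λ_{24}$ decoder, and the function names are our own.

```python
import itertools
import numpy as np

def spherical_quantize(x, codebook):
    """Project x onto the unit sphere and return the nearest codeword.

    On the sphere, minimizing angular distance is equivalent to
    maximizing the inner product with unit-norm codewords.
    """
    u = x / np.linalg.norm(x)
    idx = int(np.argmax(codebook @ u))
    return idx, codebook[idx]

# BSQ-style codebook: all 2^d sign patterns, scaled so each codeword
# lies on the unit sphere S^{d-1} (a toy stand-in for a lattice code).
d = 4
signs = np.array(list(itertools.product([-1, 1], repeat=d)), dtype=float)
codebook = signs / np.sqrt(d)  # shape (2^d, d), each row has unit norm

x = np.array([0.3, -1.2, 0.7, -0.1])
idx, q = spherical_quantize(x, codebook)
# For this codebook, the nearest codeword is simply sign(x) / sqrt(d),
# which is why BSQ needs no codebook lookup at all.
```

The Leech lattice replaces this sign-pattern codebook with the 24-dimensional lattice's far more symmetric and evenly spread point set, which is what the abstract credits for the improved reconstruction-compression tradeoff.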
Similar Papers
Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression
Machine Learning (CS)
Makes big computer brains smaller and faster.
Improving 3D Gaussian Splatting Compression by Scene-Adaptive Lattice Vector Quantization
CV and Pattern Recognition
Makes 3D pictures smaller without losing quality.
SQS: Bayesian DNN Compression through Sparse Quantized Sub-distributions
Machine Learning (CS)
Makes AI smaller and faster for phones.