AccuQuant: Simulating Multiple Denoising Steps for Quantizing Diffusion Models
By: Seunghoon Lee, Jeongwoo Choi, Byunggwan Son and more
Potential Business Impact:
Makes AI image generators faster and smaller.
We present AccuQuant, a novel post-training quantization (PTQ) method for diffusion models. We show analytically and empirically that quantization errors in diffusion models accumulate over the denoising steps of a sampling process. To alleviate this error accumulation, AccuQuant minimizes the discrepancies between the outputs of a full-precision diffusion model and its quantized version within a couple of denoising steps. That is, it explicitly simulates multiple denoising steps of the diffusion sampling process during quantization, accounting for the errors accumulated over those steps. This contrasts with previous approaches, which imitate the training process of diffusion models by minimizing the discrepancies independently for each step. We also present an efficient implementation technique for AccuQuant, together with a novel objective, which reduces the memory complexity significantly from $\mathcal{O}(n)$ to $\mathcal{O}(1)$, where $n$ is the number of denoising steps. We demonstrate the efficacy and efficiency of AccuQuant across various tasks and diffusion models on standard benchmarks.
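The difference between measuring discrepancies independently for each step and measuring them along a simulated multi-step trajectory can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the "denoiser" is a single scalar multiply, `quantize` is plain uniform rounding, and the names `denoise_step`, `independent_step_errors`, and `accumulated_errors` are hypothetical. It does not implement AccuQuant's objective or its $\mathcal{O}(1)$-memory technique; it only shows why per-step errors can understate the error that builds up during sampling.

```python
def quantize(w, step=0.1):
    """Round a weight to a uniform grid (a stand-in for a real quantizer)."""
    return round(w / step) * step


def denoise_step(x, w):
    """Toy 'denoiser': one linear update, x_{t-1} = w * x_t."""
    return w * x


def independent_step_errors(x0, w, wq, n):
    """Per-step discrepancy when both models receive the same
    full-precision input at every step (errors never compound)."""
    errs = []
    x = x0
    for _ in range(n):
        errs.append(abs(denoise_step(x, w) - denoise_step(x, wq)))
        x = denoise_step(x, w)  # always continue from the full-precision output
    return errs


def accumulated_errors(x0, w, wq, n):
    """Discrepancy when each model follows its own trajectory,
    as happens in an actual sampling process."""
    errs = []
    x_fp, x_q = x0, x0
    for _ in range(n):
        x_fp = denoise_step(x_fp, w)
        x_q = denoise_step(x_q, wq)  # quantized model consumes its own output
        errs.append(abs(x_fp - x_q))
    return errs


# Illustrative values only (assumptions, not from the paper).
X0, W, N = 1.0, 0.97, 10
WQ = quantize(W)

independent = independent_step_errors(X0, W, WQ, N)
accumulated = accumulated_errors(X0, W, WQ, N)

for t, (e_ind, e_acc) in enumerate(zip(independent, accumulated), start=1):
    print(f"step {t:2d}: independent={e_ind:.4f}  accumulated={e_acc:.4f}")
```

In this toy setting the two measurements agree at the first step and then diverge: the independent per-step error stays small, while the error along the quantized model's own trajectory keeps growing. A calibration objective that compares outputs after several simulated steps sees this growth, whereas a purely per-step objective does not.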
Similar Papers
Error Propagation Mechanisms and Compensation Strategies for Quantized Diffusion
CV and Pattern Recognition
Makes AI images faster without losing quality.
Quantizing Diffusion Models from a Sampling-Aware Perspective
CV and Pattern Recognition
Makes AI art creation much faster and better.
Diffusion Model Quantization: A Review
CV and Pattern Recognition
Makes AI art generators run on phones.