Generative Preprocessing for Image Compression with Pre-trained Diffusion Models
By: Mengxi Guo, Shijie Zhao, Junlin Li, and more
Potential Business Impact:
Makes pictures look better when they're made smaller.
Preprocessing is a well-established technique for optimizing compression, yet existing methods are predominantly Rate-Distortion (R-D) optimized and constrained by pixel-level fidelity. This work pioneers a shift towards Rate-Perception (R-P) optimization by adapting a large-scale pre-trained diffusion model for compression preprocessing. We propose a two-stage framework: first, we distill the multi-step Stable Diffusion 2.1 into a compact, one-step image-to-image model using Consistent Score Identity Distillation (CiD); second, we perform parameter-efficient fine-tuning of the distilled model's attention modules, guided by a Rate-Perception loss and a differentiable codec surrogate. Our method integrates seamlessly with standard codecs without any modification and leverages the model's generative priors to enhance texture and mitigate artifacts. Experiments show substantial R-P gains, achieving up to a 30.13% BD-rate reduction in DISTS on the Kodak dataset and delivering superior subjective visual quality.
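To make the Rate-Perception training objective concrete, here is a minimal, hedged sketch of the loss structure the abstract describes: a preprocessed image passes through a stand-in for the differentiable codec surrogate, and the loss trades off an estimated bitrate against a perceptual distance. All function names, the uniform-quantization "codec", the histogram-entropy rate proxy, and the MSE perceptual stand-in (the paper itself uses a learned surrogate and DISTS-style perceptual metrics) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def surrogate_codec(img, q=8):
    """Stand-in for the differentiable codec surrogate: uniform quantization.
    (The paper uses a learned, differentiable approximation of a real codec.)"""
    return np.round(img * q) / q

def rate_proxy(img):
    """Toy bitrate estimate: entropy (bits/pixel) of the pixel-value histogram."""
    hist, _ = np.histogram(img, bins=256, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def rp_loss(original, preprocessed, lam=0.1):
    """Rate-Perception objective sketch: lam * rate + perceptual distance.
    MSE stands in for a perceptual metric such as DISTS."""
    recon = surrogate_codec(preprocessed)
    perception = float(np.mean((original - recon) ** 2))
    rate = rate_proxy(recon)
    return lam * rate + perception

# Baseline with an identity "preprocessor": the loss is a finite, non-negative scalar.
rng = np.random.default_rng(0)
img = rng.random((32, 32))
loss = rp_loss(img, img)
```

In the paper's framework, the preprocessor (the distilled one-step diffusion model) would be trained to minimize this kind of objective, so that the image it emits both compresses cheaply under the codec and remains perceptually close to the original.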
Similar Papers
Generative Image Coding with Diffusion Prior
CV and Pattern Recognition
Makes pictures look good even when squeezed small.
A Preprocessing Framework for Video Machine Vision under Compression
Multimedia
Makes videos smaller for computers to understand.
Edit2Perceive: Image Editing Diffusion Models Are Strong Dense Perceivers
CV and Pattern Recognition
Makes computers understand pictures better for tasks.