Generative Latent Diffusion for Efficient Spatiotemporal Data Reduction
By: Xiao Li, Liangji Zhu, Anand Rangarajan, and more
Potential Business Impact:
Saves space by smartly guessing missing video parts.
Generative models have demonstrated strong performance in conditional settings and can be viewed as a form of data compression, where the condition serves as a compact representation. However, their limited controllability and reconstruction accuracy have restricted their practical use for data compression. In this work, we propose an efficient latent diffusion framework that bridges this gap by combining a variational autoencoder with a conditional diffusion model. Our method compresses only a small number of keyframes into latent space and uses them as conditioning inputs to reconstruct the remaining frames via generative interpolation, eliminating the need to store latent representations for every frame. This approach enables accurate spatiotemporal reconstruction while significantly reducing storage costs. Experimental results across multiple datasets show that our method achieves up to 10 times higher compression ratios than rule-based state-of-the-art compressors such as SZ3, and up to 63 percent better performance than leading learning-based methods at the same reconstruction error.
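The abstract describes the pipeline only at a high level, so the sketch below is a toy illustration of the idea: encode sparse keyframes into latents with a VAE, and at decompression time sample the skipped frames from a diffusion model conditioned on the surrounding keyframe latents. All module names, layer sizes, the keyframe_stride parameter, and the simplified sampling loop are assumptions made for illustration, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): keyframe-conditioned latent diffusion
# for frame interpolation. Shapes, schedules, and names are illustrative only.
import torch
import torch.nn as nn

class Encoder(nn.Module):          # VAE encoder: frame -> compact latent
    def __init__(self, ch=16, zdim=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(ch, zdim, 4, stride=2, padding=1))
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):          # VAE decoder: latent -> reconstructed frame
    def __init__(self, ch=16, zdim=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(zdim, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(ch, 1, 4, stride=2, padding=1))
    def forward(self, z):
        return self.net(z)

class Denoiser(nn.Module):
    """Predicts noise for an intermediate-frame latent, conditioned on the
    latents of the two surrounding keyframes (concatenated channel-wise)."""
    def __init__(self, zdim=4, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(zdim * 3 + 1, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, zdim, 3, padding=1))
    def forward(self, z_t, t, z_key_prev, z_key_next):
        t_map = t.view(-1, 1, 1, 1).expand(-1, 1, *z_t.shape[2:])
        return self.net(torch.cat([z_t, z_key_prev, z_key_next, t_map], dim=1))

@torch.no_grad()
def compress(frames, encoder, keyframe_stride=8):
    """Store only every `keyframe_stride`-th frame as a latent."""
    key_idx = list(range(0, len(frames), keyframe_stride))
    return key_idx, [encoder(frames[i:i + 1]) for i in key_idx]

@torch.no_grad()
def interpolate_frame(denoiser, decoder, z_prev, z_next, steps=20):
    """Toy reverse-diffusion loop that samples an intermediate latent
    conditioned on the two keyframe latents, then decodes it."""
    z = torch.randn_like(z_prev)
    for s in reversed(range(steps)):
        t = torch.full((z.shape[0],), s / steps, device=z.device)
        eps = denoiser(z, t, z_prev, z_next)
        z = z - eps / steps                      # simplified update rule
        if s > 0:
            z = z + (1.0 / steps) ** 0.5 * torch.randn_like(z)
    return decoder(z)

if __name__ == "__main__":
    enc, dec, den = Encoder(), Decoder(), Denoiser()
    video = torch.randn(16, 1, 32, 32)            # 16 toy frames
    idx, key_latents = compress(video, enc)       # only keyframes are stored
    recon = interpolate_frame(den, dec, key_latents[0], key_latents[1])
    print(recon.shape)                            # torch.Size([1, 1, 32, 32])
```

The storage saving in the abstract comes from exactly this split: only the keyframe latents are kept on disk, while every intermediate frame is regenerated at decompression time by the conditional diffusion model.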
Similar Papers
Conditional Video Generation for High-Efficiency Video Compression
CV and Pattern Recognition
Makes videos look better with less data.
DGAE: Diffusion-Guided Autoencoder for Efficient Latent Representation Learning
CV and Pattern Recognition
Makes pictures smaller, clearer, and faster to make.
Higher fidelity perceptual image and video compression with a latent conditioned residual denoising diffusion model
Image and Video Processing
Makes pictures look good while keeping details.