Plug-and-Play Fidelity Optimization for Diffusion Transformer Acceleration via Cumulative Error Minimization

Published: December 29, 2025 | arXiv ID: 2512.23258v1

By: Tong Shao, Yusen Fu, Guoying Sun and more

Potential Business Impact:

Makes AI art and video generation much faster.

Business Areas:

Image Recognition Data and Analytics, Software

Although the Diffusion Transformer (DiT) has emerged as a predominant architecture for image and video generation, its iterative denoising process makes inference slow, which hinders broader applicability and development. Caching-based methods achieve training-free acceleration but suffer from considerable computational error. Existing methods typically incorporate error correction strategies such as pruning or prediction to mitigate this error. However, their fixed caching strategies fail to adapt to the complex error variations during denoising, which limits the full potential of error correction. To tackle this challenge, we propose CEM, a novel fidelity-optimization plugin for existing error correction methods based on cumulative error minimization. CEM predefines an error prior that characterizes the model's sensitivity to acceleration, as jointly influenced by timesteps and cache intervals. Guided by this prior, we formulate a dynamic programming algorithm with cumulative error approximation to optimize the caching strategy, which minimizes the caching error and substantially improves generation fidelity. CEM is model-agnostic, generalizes well, and adapts to arbitrary acceleration budgets. It can be seamlessly integrated into existing error correction frameworks and quantized models without introducing any additional computational overhead. Extensive experiments conducted on nine generation models and quantized methods across three tasks demonstrate that CEM significantly improves the generation fidelity of existing acceleration models and outperforms the original generation performance on FLUX.1-dev, PixArt-$α$, StableDiffusion1.5 and Hunyuan. The code will be made publicly available.
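The abstract does not spell out the error definition or the exact dynamic program, so the following is only a minimal sketch of the general idea: given a precomputed error prior indexed by timestep and cache interval, pick the steps that receive a full forward pass so that the summed error is smallest under a fixed compute budget. The function names, the additive error model, and the toy prior are all assumptions made for illustration and are not taken from the paper.

```python
import math
from typing import Callable, List, Optional, Tuple


def optimize_cache_schedule(
    num_steps: int,
    budget: int,
    seg_error: Callable[[int, int], float],
    max_interval: Optional[int] = None,
) -> Tuple[float, List[int]]:
    """Choose `budget` fully computed steps out of `num_steps` denoising steps.

    Hypothetical model: a step `start` is fully computed and its features are
    reused for the following `length - 1` steps; `seg_error(start, length)` is
    the assumed prior error of that segment, and segment errors simply add up.
    Returns (minimal total error, sorted indices of fully computed steps).
    """
    if not 1 <= budget <= num_steps:
        raise ValueError("budget must lie in [1, num_steps]")
    limit = max_interval if max_interval is not None else num_steps

    inf = math.inf
    # dp[b][end]: minimal error covering steps 0..end-1 with exactly b segments.
    dp = [[inf] * (num_steps + 1) for _ in range(budget + 1)]
    parent = [[-1] * (num_steps + 1) for _ in range(budget + 1)]
    dp[0][0] = 0.0

    for b in range(1, budget + 1):
        for end in range(1, num_steps + 1):
            for start in range(max(0, end - limit), end):
                prev = dp[b - 1][start]
                if prev == inf:
                    continue
                cand = prev + seg_error(start, end - start)
                if cand < dp[b][end]:
                    dp[b][end] = cand
                    parent[b][end] = start

    if dp[budget][num_steps] == inf:
        raise ValueError("no schedule satisfies budget and max_interval")

    # Walk the parent pointers back to recover the computed steps.
    schedule: List[int] = []
    end = num_steps
    for b in range(budget, 0, -1):
        start = parent[b][end]
        schedule.append(start)
        end = start
    schedule.reverse()
    return dp[budget][num_steps], schedule


def toy_prior(sensitivity: List[float]) -> Callable[[int, int], float]:
    """Toy error prior (an assumption): error grows quadratically with the
    number of cached steps and scales with a per-timestep sensitivity."""
    def seg_error(start: int, length: int) -> float:
        return sensitivity[start] * (length - 1) ** 2
    return seg_error


if __name__ == "__main__":
    cost, steps = optimize_cache_schedule(6, 3, toy_prior([1.0] * 6))
    print(cost, steps)
```

With a uniform toy prior the sketch spreads the full computations evenly; with a prior that marks some timesteps as more sensitive, it shortens the cache intervals around them. In practice the prior would be measured on the target model rather than written by hand.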

ProCache: Constraint-Aware Feature Caching with Selective Computation for Diffusion Transformer Acceleration

CV and Pattern Recognition

Makes AI image creation much faster.

19 Dec 2025 1

88%

Accelerating Diffusion Transformer via Error-Optimized Cache

CV and Pattern Recognition

Makes AI create pictures faster without losing quality.

31 Jan 2025 1

88%

ETC: training-free diffusion models acceleration with Error-aware Trend Consistency

CV and Pattern Recognition

Makes AI art faster without losing quality.

28 Oct 2025 1

Country of Origin

🇨🇳 China

Repos / Data Links

github.com

Page Count

30 pages
