TaoCache: Structure-Maintained Video Generation Acceleration

Published: August 12, 2025 | arXiv ID: 2508.08978v1

By: Zhentao Fan, Zongzuo Wang, Weiwei Zhang

Potential Business Impact:

Speeds up AI video generation without sacrificing visual quality or prompt fidelity.

Existing cache-based acceleration methods for video diffusion models primarily skip early or mid denoising steps, which often leads to structural discrepancies relative to full-timestep generation and can hinder instruction following and character consistency. We present TaoCache, a training-free, plug-and-play caching strategy that replaces residual-based caching with a fixed-point perspective for predicting the model's noise output, making it especially effective in the late denoising stages. By calibrating the cosine similarities and norm ratios of consecutive noise deltas, TaoCache preserves high-resolution structure while enabling aggressive step skipping. The approach is orthogonal to complementary accelerations such as Pyramid Attention Broadcast (PAB) and TeaCache, and it integrates seamlessly into DiT-based frameworks. Across Latte-1, OpenSora-Plan v1.1.0, and Wan2.1, TaoCache attains substantially higher visual quality (LPIPS, SSIM, PSNR) than prior caching methods at the same speedups.
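The caching decision described in the abstract, comparing cosine similarities and norm ratios of consecutive noise deltas before reusing a cached prediction, can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' implementation: the function names, thresholds, and the simple delta-extrapolation step are all assumptions made for the sake of a concrete example.

```python
import numpy as np

def should_reuse_cache(delta_prev, delta_curr,
                       cos_thresh=0.99, norm_band=(0.9, 1.1)):
    """Decide whether to skip a denoising step and reuse the cached noise.

    delta_prev, delta_curr: consecutive noise deltas (differences between
    successive noise predictions). Thresholds here are placeholders, not
    the paper's calibrated values.
    """
    a, b = delta_prev.ravel(), delta_curr.ravel()
    # Cosine similarity: are the two deltas pointing the same way?
    cos = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
    # Norm ratio: is the delta magnitude roughly stable?
    ratio = float(np.linalg.norm(b) / (np.linalg.norm(a) + 1e-8))
    return cos >= cos_thresh and norm_band[0] <= ratio <= norm_band[1]

def extrapolate_noise(eps_cached, delta_prev):
    """Fixed-point-style prediction: if deltas are stable, advance the
    cached noise by the last observed delta instead of calling the model."""
    return eps_cached + delta_prev
```

When the deltas stay aligned and similar in magnitude (common in late denoising, per the abstract), the model call is skipped and the cached noise is extrapolated; otherwise the full model is run, which keeps the high-level structure close to full-timestep generation.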

Country of Origin
🇨🇦 Canada

Page Count
17 pages

Category
Computer Science:
Computer Vision and Pattern Recognition