LongDiff: Training-Free Long Video Generation in One Go
By: Zhuoling Li, Hossein Rahmani, Qiuhong Ke, and more
Potential Business Impact:
Makes short video tools create long videos.
Video diffusion models have recently achieved remarkable results in video generation. Despite their encouraging performance, most of these models are mainly designed and trained for short video generation, leading to challenges in maintaining temporal consistency and visual details in long video generation. In this paper, we propose LongDiff, a novel training-free method consisting of two carefully designed components, Position Mapping (PM) and Informative Frame Selection (IFS), that tackle two key challenges hindering the generalization from short to long video generation: temporal position ambiguity and information dilution. Our LongDiff unlocks the potential of off-the-shelf video diffusion models to achieve high-quality long video generation in one go. Extensive experiments demonstrate the efficacy of our method.
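To make the two components concrete, below is a minimal sketch of how such a training-free scheme might look in PyTorch. The function names position_mapping and informative_frame_selection, the linear rescaling of position ids, and the fixed local attention window are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch only: `position_mapping` and
# `informative_frame_selection` are hypothetical stand-ins for
# LongDiff's PM and IFS; the paper's exact formulations may differ.
import torch


def position_mapping(num_frames: int, trained_len: int) -> torch.Tensor:
    """Remap the temporal position ids of a long sequence into the range
    the short-video model was trained on, preserving frame order.

    Here we simply rescale indices linearly into [0, trained_len); the
    actual PM component may use a more careful mapping.
    """
    pos = torch.arange(num_frames, dtype=torch.float32)
    scaled = pos * (trained_len - 1) / max(num_frames - 1, 1)
    return scaled.round().long()  # in-distribution position ids


def informative_frame_selection(num_frames: int, window: int) -> torch.Tensor:
    """Build a boolean temporal-attention mask so each frame attends only
    to a local neighbourhood of frames, limiting information dilution when
    the sequence is much longer than the training length."""
    pos = torch.arange(num_frames)
    dist = (pos[:, None] - pos[None, :]).abs()
    return dist <= window  # keep-mask, shape (num_frames, num_frames)


# Example: sample 128 frames with a model trained on 16-frame clips.
pos_ids = position_mapping(128, trained_len=16)
attn_mask = informative_frame_selection(128, window=8)
```

In a pipeline along these lines, the remapped position ids and the attention mask would be substituted into the temporal layers of an off-the-shelf video diffusion model at sampling time, leaving its weights untouched.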
Similar Papers
VideoMerge: Towards Training-free Long Video Generation
CV and Pattern Recognition
Makes short videos longer without losing quality.
LumosFlow: Motion-Guided Long Video Generation
CV and Pattern Recognition
Makes long videos move smoothly and naturally.
DiffuseSlide: Training-Free High Frame Rate Video Generation Diffusion
CV and Pattern Recognition
Makes slow videos look super smooth and fast.