DreamStyle: A Unified Framework for Video Stylization
By: Mengtian Li, Jinshu Chen, Songtao Zhao, and more
Potential Business Impact:
Changes videos to match any style you want.
Video stylization, an important downstream task of video generation models, has not yet been thoroughly explored. Its input style conditions typically include text, a style image, and a stylized first frame. Each condition has a characteristic advantage: text is more flexible, a style image provides a more accurate visual anchor, and a stylized first frame makes long-video stylization feasible. However, existing methods are largely confined to a single type of style condition, which limits their scope of application. Additionally, the lack of high-quality datasets leads to style inconsistency and temporal flicker. To address these limitations, we introduce DreamStyle, a unified framework for video stylization supporting (1) text-guided, (2) style-image-guided, and (3) first-frame-guided video stylization, accompanied by a well-designed data curation pipeline for acquiring high-quality paired video data. DreamStyle is built on a vanilla Image-to-Video (I2V) model and trained with a Low-Rank Adaptation (LoRA) whose token-specific up matrices reduce confusion among the different condition tokens. Both qualitative and quantitative evaluations demonstrate that DreamStyle is competent in all three video stylization tasks and outperforms competing methods in style consistency and video quality.
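To make the "LoRA with token-specific up matrices" idea concrete, below is a minimal sketch, not the authors' code: it assumes a hypothetical layer `TokenTypeLoRALinear` in which all condition tokens share one low-rank down projection, while each condition type (text, style image, stylized first frame) gets its own up projection. The class name, the `token_type` routing ids, and the hyperparameters are illustrative assumptions, not details from the paper.

```python
# Sketch of a LoRA linear layer with token-specific up matrices.
# Assumption: tokens are routed to up projections by an integer
# token_type id (0 = text, 1 = style image, 2 = stylized first frame).
import torch
import torch.nn as nn


class TokenTypeLoRALinear(nn.Module):  # hypothetical name
    def __init__(self, base: nn.Linear, rank: int = 16,
                 num_token_types: int = 3, alpha: float = 16.0):
        super().__init__()
        self.base = base                      # frozen pretrained I2V projection
        self.base.requires_grad_(False)
        self.scale = alpha / rank
        # Shared low-rank down projection (d_in -> rank).
        self.lora_down = nn.Linear(base.in_features, rank, bias=False)
        # One up projection (rank -> d_out) per condition-token type.
        self.lora_up = nn.ModuleList(
            nn.Linear(rank, base.out_features, bias=False)
            for _ in range(num_token_types)
        )
        for up in self.lora_up:               # standard LoRA init: up starts at zero
            nn.init.zeros_(up.weight)

    def forward(self, x: torch.Tensor, token_type: torch.Tensor) -> torch.Tensor:
        # x:          (batch, seq_len, d_in)
        # token_type: (batch, seq_len) integer ids per token
        out = self.base(x)
        hidden = self.lora_down(x)            # shared low-rank features
        for t, up in enumerate(self.lora_up):
            mask = (token_type == t).unsqueeze(-1).to(x.dtype)
            out = out + self.scale * mask * up(hidden)
        return out
```

The point of the per-type up matrices is that each condition modality writes its update through its own output basis, so gradients from, say, style-image tokens do not overwrite directions needed by text tokens, which is one plausible way to reduce the cross-condition confusion the abstract describes.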
Similar Papers
DreamVE: Unified Instruction-based Image and Video Editing
CV and Pattern Recognition
Changes videos with simple text instructions.
UTDesign: A Unified Framework for Stylized Text Editing and Generation in Graphic Design Images
CV and Pattern Recognition
Makes computers write and design text better.
V-Stylist: Video Stylization via Collaboration and Reflection of MLLM Agents
CV and Pattern Recognition
Changes videos to any art style you want.