FlexTraj: Image-to-Video Generation with Flexible Point Trajectory Control
By: Zhiyuan Zhang, Can Wang, Dongdong Chen and more
Potential Business Impact:
Makes videos move exactly how you want.
We present FlexTraj, a framework for image-to-video generation with flexible point trajectory control. FlexTraj introduces a unified point-based motion representation that encodes each point with a segmentation ID, a temporally consistent trajectory ID, and an optional color channel for appearance cues, enabling both dense and sparse trajectory control. Instead of injecting trajectory conditions into the video generator through token concatenation or ControlNet, FlexTraj employs an efficient sequence-concatenation scheme that achieves faster convergence, stronger controllability, and more efficient inference, while maintaining robustness under unaligned conditions. To train such a unified point trajectory-controlled video generator, FlexTraj adopts an annealing training strategy that gradually reduces reliance on complete supervision and aligned conditions. Experimental results demonstrate that FlexTraj enables multi-granularity, alignment-agnostic trajectory control for video generation, supporting various applications such as motion cloning, drag-based image-to-video, motion interpolation, camera redirection, flexible action control, and mesh animation.
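The abstract gives no implementation details, so the following is only a minimal sketch of the two ideas it names, not the authors' code. It assumes that "token concatenation" means joining condition features to video tokens channel-wise (one condition token per video token) and that "sequence concatenation" means appending condition tokens along the sequence axis. The names `PointTrack`, `channel_concat`, `sequence_concat`, and `subsample` are hypothetical, and tokens are plain Python lists rather than tensors.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Token = List[float]  # one token = one feature vector (illustrative stand-in for a tensor row)


@dataclass
class PointTrack:
    """Hypothetical container for one tracked point, following the abstract's description."""
    seg_id: int                                  # segmentation ID: which region the point belongs to
    traj_id: int                                 # trajectory ID: stays the same across all frames
    xy: List[Tuple[float, float]]                # one (x, y) position per frame
    color: Optional[Tuple[int, int, int]] = None  # optional RGB appearance cue


def subsample(tracks: List[PointTrack], keep_every: int) -> List[PointTrack]:
    """Turn a dense set of tracks into a sparser one by keeping every k-th track."""
    return tracks[::keep_every]


def channel_concat(video: List[Token], cond: List[Token]) -> List[Token]:
    """Assumed channel-wise scheme: widen each video token with its matching condition token."""
    if len(video) != len(cond):
        raise ValueError("channel-wise concatenation needs one condition token per video token")
    return [v + c for v, c in zip(video, cond)]


def sequence_concat(video: List[Token], cond: List[Token]) -> List[Token]:
    """Assumed sequence scheme: append condition tokens after the video tokens."""
    if video and cond and len(video[0]) != len(cond[0]):
        raise ValueError("video and condition tokens must share the same width")
    return video + cond


# Dense control: eight tracked points over three frames; sparse control: every fourth point.
dense = [PointTrack(seg_id=i // 4, traj_id=i, xy=[(float(i), float(t)) for t in range(3)])
         for i in range(8)]
sparse = subsample(dense, keep_every=4)

video_tokens = [[0.0] * 4 for _ in range(6)]            # 6 video tokens of width 4
cond_tokens = [[1.0] * 4 for _ in range(len(sparse))]   # 2 condition tokens of width 4
joined = sequence_concat(video_tokens, cond_tokens)     # 8 tokens of width 4
```

Under these assumptions, the sequence scheme accepts any number of condition tokens, so the same interface covers dense and sparse control, whereas the channel-wise scheme raises an error whenever the two token counts differ. In a real transformer the appended tokens would interact with the video tokens through attention, which this sketch does not model.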
Similar Papers
PoseTraj: Pose-Aware Trajectory Control in Video Diffusion
CV and Pattern Recognition
Makes videos move objects realistically in 3D.
MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance
CV and Pattern Recognition
Makes videos follow your drawn paths perfectly.
Generative Video Motion Editing with 3D Point Tracks
CV and Pattern Recognition
Edits videos by changing how things move.