Animus3D: Text-driven 3D Animation via Motion Score Distillation
By: Qi Sun, Can Wang, Jiaxiang Shang and more
Potential Business Impact:
Makes 3D models move like you describe.
We present Animus3D, a text-driven 3D animation framework that generates a motion field given a static 3D asset and a text prompt. Previous methods mostly rely on the vanilla Score Distillation Sampling (SDS) objective to distill motion from a pretrained text-to-video diffusion model, which leads to animations with minimal movement or noticeable jitter. To address this, our approach introduces a novel SDS alternative, Motion Score Distillation (MSD). Specifically, we introduce a LoRA-enhanced video diffusion model that defines a static source distribution, rather than the pure noise used in SDS, while an inversion-based noise estimation technique ensures appearance preservation when guiding motion. To further improve motion fidelity, we incorporate explicit temporal and spatial regularization terms that mitigate geometric distortions across time and space. Additionally, we propose a motion refinement module that upscales the temporal resolution and enhances fine-grained details, overcoming the fixed-resolution constraints of the underlying video model. Extensive experiments demonstrate that Animus3D successfully animates static 3D assets from diverse text prompts, generating significantly more substantial and detailed motion than state-of-the-art baselines while maintaining high visual integrity. Code will be released at https://qiisun.github.io/animus3d_page.
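For orientation, the vanilla SDS gradient that the abstract contrasts against is commonly written as in the first equation below. The notation is an assumption and is not taken from this abstract: \theta stands for the motion field parameters, x for the rendered video, x_t for its noised version at timestep t, \epsilon_\phi for the pretrained text-to-video denoiser, y for the text prompt, and w(t) for a timestep weighting. The second equation is only a schematic reading of the abstract, in which the pure-noise term \epsilon is replaced by a prediction \epsilon_{\mathrm{src}} from a model describing the static source. The symbol \epsilon_{\mathrm{src}} is hypothetical, and the paper's actual MSD objective may differ.

```latex
% Standard SDS gradient (notation assumed, not from the abstract)
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\epsilon}\left[ w(t)\,
    \bigl(\epsilon_\phi(x_t;\, y,\, t) - \epsilon\bigr)\,
    \frac{\partial x}{\partial \theta} \right]

% Schematic only: pure noise replaced by a static-source prediction
% (\epsilon_{\mathrm{src}} is a hypothetical symbol, not the paper's)
\nabla_\theta \mathcal{L}_{\mathrm{MSD}}
  \approx \mathbb{E}_{t}\left[ w(t)\,
    \bigl(\epsilon_\phi(x_t;\, y,\, t) - \epsilon_{\mathrm{src}}(x_t;\, t)\bigr)\,
    \frac{\partial x}{\partial \theta} \right]
```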
Similar Papers
Object-Aware 4D Human Motion Generation
CV and Pattern Recognition
Makes videos of people move realistically with objects.
Text-based Animatable 3D Avatars with Morphable Model Alignment
CV and Pattern Recognition
Creates realistic talking 3D heads from text.
SketchAnimator: Animate Sketch via Motion Customization of Text-to-Video Diffusion Models
CV and Pattern Recognition
Makes drawings move like a video.