Object-Aware 4D Human Motion Generation

Published: October 31, 2025 | arXiv ID: 2511.00248v1

By: Shurui Gui, Deep Anil Patel, Xiner Li and more

Potential Business Impact:

Makes people in generated videos move realistically when interacting with objects.

Business Areas:

Motion Capture, Media and Entertainment, Video

Recent advances in video diffusion models have enabled the generation of high-quality videos. However, these videos still suffer from unrealistic deformations, semantic violations, and physical inconsistencies that are largely rooted in the absence of 3D physical priors. To address these challenges, we propose an object-aware 4D human motion generation framework grounded in 3D Gaussian representations and motion diffusion priors. Given pre-generated 3D humans and objects, our method, Motion Score Distilled Interaction (MSDI), draws on the spatial and prompt-semantic information in large language models (LLMs) and on motion priors through the proposed Motion Diffusion Score Distillation Sampling (MSDS). The combination of MSDS and LLMs enables our spatial-aware motion optimization, which distills score gradients from pre-trained motion diffusion models to refine human motion while respecting object and semantic constraints. Unlike prior methods that require joint training on limited interaction datasets, our zero-shot approach avoids retraining and generalizes to out-of-distribution object-aware human motions. Experiments demonstrate that our framework produces natural and physically plausible human motions that respect 3D spatial context, offering a scalable solution for realistic 4D generation.
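
The abstract describes the optimization only at a high level, so the following is a toy sketch of generic score distillation sampling, not the authors' MSDS or MSDI implementation. It assumes a one-dimensional "motion" (one joint height per frame), replaces the pre-trained motion diffusion model with a closed-form Gaussian prior whose optimal noise prediction is known analytically, and stands in for the object constraint with a hypothetical quadratic penalty that keeps the joint above a surface height. The LLM component is omitted entirely. All names (`predicted_noise`, `sds_gradient`, `contact_gradient`, `refine_motion`) and all constants are illustrative assumptions.

```python
import math
import random


def predicted_noise(x_t, alpha_bar, prior_mean, prior_std):
    """Optimal noise prediction E[eps | x_t] when clean data ~ N(prior_mean, prior_std^2).

    Stand-in for a pre-trained diffusion model (assumption: Gaussian prior).
    """
    var_t = alpha_bar * prior_std ** 2 + (1.0 - alpha_bar)
    return math.sqrt(1.0 - alpha_bar) * (x_t - math.sqrt(alpha_bar) * prior_mean) / var_t


def sds_gradient(x, prior_mean, prior_std, rng):
    """One stochastic SDS gradient sample: w(t) * (eps_hat - eps)."""
    alpha_bar = rng.uniform(0.02, 0.98)   # random noise level
    eps = rng.gauss(0.0, 1.0)             # injected noise
    x_t = math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * eps
    eps_hat = predicted_noise(x_t, alpha_bar, prior_mean, prior_std)
    weight = 1.0 - alpha_bar              # a common choice of w(t)
    return weight * (eps_hat - eps)


def contact_gradient(x, surface, strength):
    """Gradient of the hypothetical penalty 0.5 * strength * max(0, surface - x)^2."""
    return -strength * max(0.0, surface - x)


def refine_motion(motion, prior_means, surface, steps=2000, lr=0.02,
                  prior_std=0.1, strength=20.0, seed=0):
    """Gradient descent on per-frame heights using SDS plus the penalty."""
    rng = random.Random(seed)
    motion = list(motion)
    for _ in range(steps):
        for i, x in enumerate(motion):
            grad = sds_gradient(x, prior_means[i], prior_std, rng)
            grad += contact_gradient(x, surface, strength)
            motion[i] = x - lr * grad
    return motion


# The prior prefers heights 0.2 in frames 2 and 3, which would pass through
# a surface at height 0.5; the penalty should hold those frames near 0.5.
prior_means = [0.9, 0.8, 0.2, 0.2, 0.8]
refined = refine_motion([0.0] * 5, prior_means, surface=0.5)
```

In this toy setting the prior itself is never retrained: only the motion values are updated, which mirrors the zero-shot flavor described in the abstract. Frames whose prior mean lies above the surface should settle near that mean, while frames whose prior mean lies below it should settle just under the surface, where the two gradients balance.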

Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding

CV and Pattern Recognition

Makes videos show 3D worlds without flickering.

3 Dec 2025 0

89%

Unconditional Human Motion and Shape Generation via Balanced Score-Based Diffusion

CV and Pattern Recognition

Makes computer-made people move more realistically.

14 Oct 2025 4

89%

Back to Basics: Motion Representation Matters for Human Motion Generation Using Diffusion Model

CV and Pattern Recognition

Makes computer-generated dancing look more real.

4 Dec 2025 0

Country of Origin

🇺🇸 United States

Page Count

16 pages
