Shape My Moves: Text-Driven Shape-Aware Synthesis of Human Motions

Published: April 4, 2025 | arXiv ID: 2504.03639v1

By: Ting-Hsuan Liao, Yi Zhou, Yu Shen and more

Potential Business Impact:

Makes animated characters move in ways that suit their body shape, generated from a text description.

Business Areas:
Motion Capture, Media and Entertainment, Video

We explore how body shapes influence human motion synthesis, an aspect often overlooked in existing text-to-motion generation methods due to the ease of learning a homogenized, canonical body shape. However, this homogenization can distort the natural correlations between different body shapes and their motion dynamics. Our method addresses this gap by generating body-shape-aware human motions from natural language prompts. We utilize a finite scalar quantization-based variational autoencoder (FSQ-VAE) to quantize motion into discrete tokens and then leverage continuous body shape information to de-quantize these tokens back into continuous, detailed motion. Additionally, we harness the capabilities of a pretrained language model to predict both continuous shape parameters and motion tokens, facilitating the synthesis of text-aligned motions and decoding them into shape-aware motions. We evaluate our method quantitatively and qualitatively, and also conduct a comprehensive perceptual study to demonstrate its efficacy in generating shape-aware motions.
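To make the quantization step in the abstract more concrete, here is a minimal sketch of finite scalar quantization (FSQ) on a single latent vector. It is an illustration under stated assumptions, not the authors' implementation: the function names (`fsq_quantize`, `codes_to_token`, `token_to_codes`), the level counts, and the restriction to odd level counts are all hypothetical choices made for this example. The shape-conditioned decoder that turns tokens back into continuous motion is not shown.

```python
import math


def fsq_quantize(z, levels):
    """Quantize latent vector z, using one (odd) level count per dimension.

    Assumption for this sketch: every level count is odd, so the codes are
    the integers -L//2 .. L//2 and no half-step offset is needed.
    """
    codes = []
    for value, num_levels in zip(z, levels):
        if num_levels % 2 == 0:
            raise ValueError("this sketch only handles odd level counts")
        half = num_levels // 2
        bounded = half * math.tanh(value)  # squash into (-half, half)
        codes.append(int(round(bounded)))  # snap to the nearest integer level
    return codes


def codes_to_token(codes, levels):
    """Pack per-dimension codes into a single token id (mixed-radix)."""
    token = 0
    for code, num_levels in zip(codes, levels):
        token = token * num_levels + (code + num_levels // 2)
    return token


def token_to_codes(token, levels):
    """Invert codes_to_token: recover the per-dimension codes."""
    codes = []
    for num_levels in reversed(levels):
        token, digit = divmod(token, num_levels)
        codes.append(digit - num_levels // 2)
    return codes[::-1]


# Hypothetical example: a 3-dimensional latent with 5, 5, and 3 levels,
# giving an implicit codebook of 5 * 5 * 3 = 75 tokens.
levels = [5, 5, 3]
codes = fsq_quantize([0.0, 10.0, -10.0], levels)
token = codes_to_token(codes, levels)
```

Because the codebook is just the fixed grid of integer levels, there are no codebook embeddings to learn; the token id is a mixed-radix encoding of the per-dimension codes and can be inverted exactly.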

Page Count
12 pages

Category
Computer Science:
Computer Vision and Pattern Recognition