MG-MotionLLM: A Unified Framework for Motion Comprehension and Generation across Multiple Granularities

Published: April 3, 2025 | arXiv ID: 2504.02478v1

By: Bizhu Wu, Jinheng Xie, Keming Shen and more

Potential Business Impact:

Teaches computers to understand and create detailed body movements.

Business Areas:

Motion Capture, Media and Entertainment, Video

Recent motion-aware large language models have demonstrated promising potential in unifying motion comprehension and generation. However, existing approaches primarily focus on coarse-grained motion-text modeling, where text describes the overall semantics of an entire motion sequence in just a few words. This limits their ability to handle fine-grained motion-relevant tasks, such as understanding and controlling the movements of specific body parts. To overcome this limitation, we pioneer MG-MotionLLM, a unified motion-language model for multi-granular motion comprehension and generation. We further introduce a comprehensive multi-granularity training scheme that incorporates a set of novel auxiliary tasks, such as localizing the temporal boundaries of motion segments from detailed text and detailed motion captioning, to facilitate mutual reinforcement of motion-text modeling across levels of granularity. Extensive experiments show that MG-MotionLLM achieves superior performance on classical text-to-motion and motion-to-text tasks, and exhibits potential in novel fine-grained motion comprehension and editing tasks. Project page: CVI-SZU/MG-MotionLLM

IRG-MotionLLM: Interleaving Motion Generation, Assessment and Refinement for Text-to-Motion Generation

CV and Pattern Recognition

Makes computer-made movements look more real.

11 Dec 2025 1

90%

Unlocking Pretrained LLMs for Motion-Related Multimodal Generation: A Fine-Tuning Approach to Unify Diffusion and Next-Token Prediction

Machine Learning (CS)

Creates realistic character movements from text.

8 Mar 2025 1

89%

Multimodal Generative AI with Autoregressive LLMs for Human Motion Understanding and Generation: A Way Forward

CV and Pattern Recognition

Makes computers create realistic human movements from words.

31 May 2025 0

Country of Origin

🇨🇳 China

Page Count

20 pages
