Global Position Aware Group Choreography using Large Language Model
By: Haozhou Pang, Tianwei Ding, Lanshan He and more
Potential Business Impact:
Generates group dances that stay closely synchronized with the music.
Dance serves as a profound and universal expression of human culture, conveying emotions and stories through movements synchronized with music. Although existing work has achieved satisfactory results in single-person dance generation, multi-person dance generation remains relatively unexplored. In this work, we present a group choreography framework that leverages recent advances in Large Language Models (LLMs) by modeling group dance generation as a sequence-to-sequence translation task. Our framework consists of a tokenizer that transforms continuous features into discrete tokens, and an LLM that is fine-tuned to predict motion tokens given the audio tokens. We show that with proper tokenization of the input modalities and careful design of the LLM training strategies, our framework can generate realistic and diverse group dances while maintaining strong music correlation and dancer-wise consistency. Extensive experiments and evaluations demonstrate that our framework achieves state-of-the-art performance.
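As a rough illustration of the tokenization step described in the abstract, the sketch below maps continuous feature frames to discrete token ids by nearest-neighbor lookup in a codebook. The abstract does not specify how the tokenizer works, so the nearest-codebook scheme, the toy codebook values, and the names `nearest_code`, `tokenize`, and `detokenize` are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_code(feature: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to `feature` (squared L2)."""
    best_index, best_dist = 0, float("inf")
    for index, code in enumerate(codebook):
        dist = sum((f - c) ** 2 for f, c in zip(feature, code))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def tokenize(frames: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Map each continuous frame to a discrete token id."""
    return [nearest_code(frame, codebook) for frame in frames]


def detokenize(tokens: Sequence[int], codebook: Sequence[Vector]) -> List[Vector]:
    """Map token ids back to their (approximate) continuous features."""
    return [codebook[token] for token in tokens]


# Hypothetical toy codebook and audio features; real codebooks would be learned.
AUDIO_CODEBOOK = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
audio_frames = [[0.1, -0.1], [0.9, 0.2], [0.1, 0.8]]

audio_tokens = tokenize(audio_frames, AUDIO_CODEBOOK)
print(audio_tokens)
```

In a sequence-to-sequence setup like the one the abstract describes, token ids of this kind would form the source sequence, and the fine-tuned LLM would predict motion token ids that a motion codebook turns back into poses. How the framework actually arranges tokens for multiple dancers is not stated here.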
Similar Papers
DanceChat: Large Language Model-Guided Music-to-Dance Generation
CV and Pattern Recognition
Makes music turn into cool dance moves.
DanceMosaic: High-Fidelity Dance Generation with Multimodal Editability
Graphics
Creates realistic, editable 3D dances from music and text.
Motion is the Choreographer: Learning Latent Pose Dynamics for Seamless Sign Language Generation
CV and Pattern Recognition
Creates sign language videos of anyone signing anything.