Beat on Gaze: Learning Stylized Generation of Gaze and Head Dynamics
By: Chengwei Shi, Chong Cao, Xin Tong, and more
Potential Business Impact:
Makes animated characters' eyes and heads move naturally while they talk.
Head and gaze dynamics are crucial for conveying emotion and intention in expressive 3D facial animation. However, existing methods frequently address facial components in isolation, overlooking the intricate coordination between gaze, head motion, and speech. The scarcity of high-quality gaze-annotated datasets further hinders the development of data-driven models capable of capturing realistic, personalized gaze control. To address these challenges, we propose StyGazeTalk, an audio-driven method that generates synchronized gaze and head motion styles. We extract speaker-specific motion traits from gaze-head sequences with a multi-layer LSTM structure incorporating a style encoder, enabling the generation of diverse animation styles. We also introduce a high-precision multimodal dataset comprising eye-tracked gaze, audio, head pose, and 3D facial parameters, providing a valuable resource for training and evaluating head and gaze control models. Experimental results demonstrate that our method generates realistic, temporally coherent, and style-aware head-gaze motions, significantly advancing the state of the art in audio-driven facial animation.
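The abstract describes the core architecture only at a high level: a style encoder that distills speaker-specific traits from gaze-head sequences, and a multi-layer LSTM that maps audio features plus that style embedding to gaze and head motion. Below is a minimal, illustrative PyTorch sketch of how such a pipeline could be wired. All layer sizes, feature dimensions, and the output parameterization (gaze yaw/pitch plus head yaw/pitch/roll) are assumptions for illustration, not details taken from the paper.

```python
import torch
import torch.nn as nn

class StyleEncoder(nn.Module):
    """Encodes a reference gaze-head sequence into a fixed style embedding.
    Dimensions here are illustrative assumptions, not the paper's values."""
    def __init__(self, motion_dim=8, hidden_dim=128, style_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(motion_dim, hidden_dim, num_layers=2, batch_first=True)
        self.proj = nn.Linear(hidden_dim, style_dim)

    def forward(self, motion_seq):          # (B, T, motion_dim)
        _, (h, _) = self.lstm(motion_seq)
        return self.proj(h[-1])              # (B, style_dim)

class GazeHeadGenerator(nn.Module):
    """Audio-driven multi-layer LSTM predicting per-frame gaze (2 angles)
    and head pose (3 angles), conditioned on a style embedding."""
    def __init__(self, audio_dim=80, style_dim=64, hidden_dim=256, out_dim=5):
        super().__init__()
        self.lstm = nn.LSTM(audio_dim + style_dim, hidden_dim,
                            num_layers=3, batch_first=True)
        self.head = nn.Linear(hidden_dim, out_dim)

    def forward(self, audio_feats, style):   # (B, T, audio_dim), (B, style_dim)
        # Broadcast the style code across time and fuse it with the audio features.
        style_seq = style.unsqueeze(1).expand(-1, audio_feats.size(1), -1)
        x = torch.cat([audio_feats, style_seq], dim=-1)
        out, _ = self.lstm(x)
        return self.head(out)                 # (B, T, 5): gaze yaw/pitch + head yaw/pitch/roll

# Usage example: a 2-second clip at 30 fps with 80-dim audio features per frame.
encoder, generator = StyleEncoder(), GazeHeadGenerator()
style = encoder(torch.randn(1, 60, 8))        # reference gaze-head sequence
motion = generator(torch.randn(1, 60, 80), style)
print(motion.shape)                           # torch.Size([1, 60, 5])
```

In this sketch the style embedding is simply concatenated with the audio features at every timestep; the paper does not specify its conditioning mechanism, so this should be read as one plausible design choice rather than the authors' method.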
Similar Papers
Model See Model Do: Speech-Driven Facial Animation with Style Control
Graphics
Makes cartoon faces talk and show feelings.
MEDTalk: Multimodal Controlled 3D Facial Animation with Dynamic Emotions by Disentangled Embedding
CV and Pattern Recognition
Makes talking faces show real feelings from sound.
StyleSpeaker: Audio-Enhanced Fine-Grained Style Modeling for Speech-Driven 3D Facial Animation
Multimedia
Makes talking faces move realistically for any person.