EditEmoTalk: Controllable Speech-Driven 3D Facial Animation with Continuous Expression Editing

Published: January 15, 2026 | arXiv ID: 2601.10000v1

By: Diqiong Jiang, Kai Zhu, Dan Song, and more

Speech-driven 3D facial animation aims to generate realistic and expressive facial motions directly from audio. While recent methods achieve high-quality lip synchronization, they often rely on discrete emotion categories, limiting continuous and fine-grained emotional control. We present EditEmoTalk, a controllable speech-driven 3D facial animation framework with continuous emotion editing. The key idea is a boundary-aware semantic embedding that learns the normal directions of inter-emotion decision boundaries, enabling a continuous expression manifold for smooth emotion manipulation. Moreover, we introduce an emotional consistency loss that enforces semantic alignment between the generated motion dynamics and the target emotion embedding through a mapping network, ensuring faithful emotional expression. Extensive experiments demonstrate that EditEmoTalk achieves superior controllability, expressiveness, and generalization while maintaining accurate lip synchronization. Code and pretrained models will be released.
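The abstract describes two mechanisms: editing emotion by moving along the normal direction of an inter-emotion decision boundary, and an emotional consistency loss that aligns generated motion with the target emotion through a mapping network. The paper's code is not yet released, so the following is only a minimal, illustrative sketch of those two ideas in PyTorch; the function names, tensor shapes, and the choice of a cosine-similarity loss are assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): continuous emotion editing along
# a decision-boundary normal, plus a simple emotional consistency loss.
# All names and shapes below are assumptions for demonstration only.

import torch
import torch.nn.functional as F


def edit_emotion(base_embedding: torch.Tensor,
                 boundary_normal: torch.Tensor,
                 intensity: float) -> torch.Tensor:
    """Shift an emotion embedding along the unit normal of a learned boundary.

    base_embedding:  (D,) embedding of the source emotion.
    boundary_normal: (D,) normal of the decision boundary between two emotions,
                     e.g. the weight vector of a linear classifier separating them.
    intensity:       signed scalar; 0 keeps the source emotion, larger values
                     move the embedding continuously toward the other emotion.
    """
    n = F.normalize(boundary_normal, dim=0)   # unit normal direction
    return base_embedding + intensity * n     # continuous traversal on the manifold


def emotional_consistency_loss(motion_features: torch.Tensor,
                               target_emotion: torch.Tensor,
                               mapping_net: torch.nn.Module) -> torch.Tensor:
    """Encourage generated motion dynamics to agree with the target emotion.

    motion_features: (B, D_m) features summarizing the generated facial motion.
    target_emotion:  (B, D_e) target emotion embeddings (possibly edited above).
    mapping_net:     network projecting motion features into the emotion space.
    """
    pred = F.normalize(mapping_net(motion_features), dim=-1)
    tgt = F.normalize(target_emotion, dim=-1)
    return (1.0 - (pred * tgt).sum(dim=-1)).mean()   # 1 - cosine similarity


if __name__ == "__main__":
    D = 64
    base = torch.randn(D)
    normal = torch.randn(D)                   # e.g. a neutral-to-happy boundary normal
    for alpha in (0.0, 0.5, 1.0):             # sweep emotion intensity
        edited = edit_emotion(base, normal, alpha)
        print(f"intensity={alpha}: edited embedding norm={edited.norm().item():.3f}")
```

In this reading, the intensity scalar gives the continuous, fine-grained control the abstract contrasts with discrete emotion categories, while the consistency loss ties the decoded motion back to the (possibly edited) emotion embedding.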

Category: Computer Science (Multimedia)