Hi-Reco: High-Fidelity Real-Time Conversational Digital Humans
By: Hongbin Huang, Junwei Li, Tianxin Xie and more
Potential Business Impact:
Creates lifelike digital people that talk and react instantly.
High-fidelity digital humans are increasingly used in interactive applications, yet achieving both visual realism and real-time responsiveness remains a major challenge. We present a high-fidelity, real-time conversational digital human system that seamlessly combines a visually realistic 3D avatar, persona-driven expressive speech synthesis, and knowledge-grounded dialogue generation. To support natural and timely interaction, we introduce an asynchronous execution pipeline that coordinates multi-modal components with minimal latency. The system supports advanced features such as wake word detection, emotionally expressive prosody, and highly accurate, context-aware response generation. It leverages novel retrieval-augmented methods, including history augmentation to maintain conversational flow and intent-based routing for efficient knowledge access. Together, these components form an integrated system that enables responsive and believable digital humans, suitable for immersive applications in communication, education, and entertainment.
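The asynchronous execution pipeline mentioned in the abstract could look roughly like the sketch below. The abstract gives no implementation details, so everything here is an assumption made for illustration: the three-stage split (dialogue, speech, avatar), the sentence-level streaming, and all function names (`generate_reply`, `synthesize_speech`, `animate_avatar`, `run_pipeline`) are hypothetical, and the model calls are replaced by string placeholders. The sketch only shows the general idea of overlapping stages through queues so that later stages can start before earlier ones finish.

```python
import asyncio


async def generate_reply(query, text_q):
    # Hypothetical stand-in for the dialogue model: emits the reply one
    # sentence at a time instead of waiting for the full response.
    for sentence in [f"You asked: {query}.", "Here is a short answer."]:
        await asyncio.sleep(0)  # yield control, as a streaming model call would
        await text_q.put(sentence)
    await text_q.put(None)  # sentinel: no more text


async def synthesize_speech(text_q, audio_q):
    # Hypothetical stand-in for speech synthesis: converts each sentence as
    # soon as it arrives, while the dialogue stage keeps generating.
    while (sentence := await text_q.get()) is not None:
        await asyncio.sleep(0)
        await audio_q.put(f"audio<{sentence}>")
    await audio_q.put(None)  # sentinel: no more audio


async def animate_avatar(audio_q, frames):
    # Hypothetical stand-in for avatar rendering: consumes audio clips in order.
    while (clip := await audio_q.get()) is not None:
        await asyncio.sleep(0)
        frames.append(f"frames<{clip}>")


async def run_pipeline(query):
    # The queues decouple the stages; gather runs all three concurrently.
    text_q = asyncio.Queue()
    audio_q = asyncio.Queue()
    frames = []
    await asyncio.gather(
        generate_reply(query, text_q),
        synthesize_speech(text_q, audio_q),
        animate_avatar(audio_q, frames),
    )
    return frames


if __name__ == "__main__":
    for frame in asyncio.run(run_pipeline("hello")):
        print(frame)
```

In a design like this, the time to the first rendered frame depends on the first sentence rather than the whole reply, which is one plausible way to keep perceived latency low. Whether Hi-Reco streams at sentence granularity is not stated in the abstract.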
Similar Papers
Towards Interactive Intelligence for Digital Humans
CV and Pattern Recognition
Makes digital people act and learn like real ones.
Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis
CV and Pattern Recognition
Makes robots move and react like real people.
RealTalk: Realistic Emotion-Aware Lifelike Talking-Head Synthesis
CV and Pattern Recognition
Makes computer faces show real feelings from voices.