AIVA: An AI-based Virtual Companion for Emotion-aware Interaction
By: Chenxi Li
Potential Business Impact:
AI understands your feelings to talk and act better.
Recent advances in Large Language Models (LLMs) have significantly improved natural language understanding and generation, enhancing Human-Computer Interaction (HCI). However, LLMs are limited to unimodal text processing and lack the ability to interpret emotional cues from non-verbal signals, hindering more immersive and empathetic interactions. This work explores integrating multimodal sentiment perception into LLMs to create emotion-aware agents. We propose AIVA, an AI-based virtual companion that captures multimodal sentiment cues, enabling emotionally aligned and animated HCI. AIVA introduces a Multimodal Sentiment Perception Network (MSPN) using a cross-modal fusion transformer and supervised contrastive learning to provide emotional cues. Additionally, we develop an emotion-aware prompt engineering strategy for generating empathetic responses and integrate a Text-to-Speech (TTS) system and animated avatar module for expressive interactions. AIVA provides a framework for emotion-aware agents with applications in companion robotics, social care, mental health, and human-centered AI.
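A minimal sketch of how the MSPN described above might be realized, assuming a PyTorch implementation in which each modality is projected into a shared space, fused by a small transformer encoder, and trained with a SupCon-style supervised contrastive loss. The module names, feature dimensions, the 7-class emotion head, and the exact loss formulation are illustrative assumptions, not the authors' released code.

```python
# Hedged sketch of a Multimodal Sentiment Perception Network (MSPN):
# cross-modal fusion transformer + supervised contrastive objective.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossModalFusionEncoder(nn.Module):
    """Projects per-modality features to a shared space and fuses them with a transformer."""

    def __init__(self, text_dim=768, audio_dim=74, video_dim=35,
                 d_model=256, n_layers=2, n_heads=4, n_emotions=7):
        super().__init__()
        self.proj = nn.ModuleDict({
            "text": nn.Linear(text_dim, d_model),
            "audio": nn.Linear(audio_dim, d_model),
            "video": nn.Linear(video_dim, d_model),
        })
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, n_layers)
        self.cls_head = nn.Linear(d_model, n_emotions)  # discrete emotion classes (assumption)

    def forward(self, text, audio, video):
        # One token per modality; self-attention performs the cross-modal fusion.
        tokens = torch.stack([self.proj["text"](text),
                              self.proj["audio"](audio),
                              self.proj["video"](video)], dim=1)   # (B, 3, d_model)
        fused = self.fusion(tokens).mean(dim=1)                    # (B, d_model)
        return F.normalize(fused, dim=-1), self.cls_head(fused)


def supervised_contrastive_loss(embeddings, labels, temperature=0.07):
    """SupCon-style loss: samples sharing an emotion label are pulled together."""
    sim = embeddings @ embeddings.T / temperature
    mask_self = torch.eye(len(labels), device=labels.device)
    mask_pos = (labels[:, None] == labels[None, :]).float() * (1.0 - mask_self)
    # log-softmax over all other samples (exclude each sample from its own denominator)
    log_prob = sim - torch.logsumexp(
        sim.masked_fill(mask_self.bool(), float("-inf")), dim=1, keepdim=True)
    pos_count = mask_pos.sum(1).clamp(min=1)
    return -(mask_pos * log_prob).sum(1).div(pos_count).mean()
```

The emotion-aware prompt engineering strategy is only outlined in the abstract; one plausible reading is that the emotion predicted by the MSPN is injected into the system prompt so the LLM conditions its reply on the user's affective state. The template below is a guess at that idea, not the paper's exact prompt.

```python
# Assumed illustration of emotion-aware prompting: the predicted emotion label
# is folded into the system message before the LLM generates a reply.
def build_emotion_aware_prompt(user_utterance: str, predicted_emotion: str) -> list[dict]:
    system = (
        "You are AIVA, an empathetic virtual companion. "
        f"The user currently appears to feel {predicted_emotion}. "
        "Acknowledge that feeling and respond in a supportive, natural tone."
    )
    return [{"role": "system", "content": system},
            {"role": "user", "content": user_utterance}]
```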
Similar Papers
Agent-Based Modular Learning for Multimodal Emotion Recognition in Human-Agent Systems
Machine Learning (CS)
Helps computers understand feelings from faces, voices, words.
AI shares emotion with humans across languages and cultures
Computation and Language
AI understands and shows feelings like people.
Computational emotion analysis with multimodal LLMs: Current evidence on an emerging methodological opportunity
Computation and Language
AI can't reliably tell emotions in real speeches.