HSEmotion Team at ABAW-8 Competition: Audiovisual Ambivalence/Hesitancy, Emotional Mimicry Intensity and Facial Expression Recognition
By: Andrey V. Savchenko
Potential Business Impact:
Helps computers understand emotions from faces, voices, and words.
This article presents our results for the eighth Affective Behavior Analysis in-the-Wild (ABAW) competition. We combine facial emotional descriptors extracted by pre-trained models, namely, our EmotiEffLib library, with acoustic features and embeddings of texts recognized from speech. The frame-level features are aggregated and fed into simple classifiers, e.g., multi-layered perceptron (feed-forward neural network with one hidden layer), to predict ambivalence/hesitancy and facial expressions. In the latter case, we also use the pre-trained facial expression recognition model to select high-score video frames and prevent their processing with a domain-specific video classifier. The video-level prediction of emotional mimicry intensity is implemented by simply aggregating frame-level features and training a multi-layered perceptron. Experimental results for three tasks from the ABAW challenge demonstrate that our approach significantly increases validation metrics compared to existing baselines.
Similar Papers
Semantic Matters: Multimodal Features for Affective Analysis
CV and Pattern Recognition
Helps computers understand emotions from voice and face.
Feature-Based Dual Visual Feature Extraction Model for Compound Multimodal Emotion Recognition
CV and Pattern Recognition
Helps computers understand emotions from faces and voices.
Interactive Multimodal Fusion with Temporal Modeling
CV and Pattern Recognition
Lets computers guess your feelings from faces and voices.