Computational emotion analysis with multimodal LLMs: Current evidence on an emerging methodological opportunity

Published: December 11, 2025 | arXiv ID: 2512.10882v1

By: Hauke Licht

Potential Business Impact:

AI cannot yet reliably rate emotions in real-world political speeches.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Emotions are central to politics, and analyzing their role in political communication has a long tradition. As research increasingly leverages audio-visual materials to analyze the display of emotions, the emergence of multimodal generative AI promises great advances. However, we lack evidence about the effectiveness of multimodal AI in emotion analysis. This paper addresses this gap by evaluating current multimodal large language models (mLLMs) in video-based analysis of emotional arousal in two complementary data sets of human-labeled video recordings. I find that under ideal circumstances, mLLMs' emotional arousal ratings are highly reliable and show little to no indication of demographic bias. However, in recordings of speakers in real-world parliamentary debates, mLLMs' arousal ratings fail to deliver on this promise, with potential negative consequences for downstream statistical inferences. This study therefore underscores the need for continued, thorough evaluation of emerging generative AI methods in political analysis and contributes a suitable, replicable framework.
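To give a sense of the kind of criterion-validity check such an evaluation involves, here is a minimal sketch, not the paper's actual pipeline: given hypothetical mLLM-generated arousal ratings and human annotations for the same video clips, it computes the Pearson correlation between the two. The rating scale, clip count, and values below are illustrative assumptions.

```python
# Minimal sketch of comparing mLLM arousal ratings to human annotations.
# All data below are illustrative placeholders, not the paper's materials.
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical arousal ratings (0-10 scale) for the same five video clips.
human_ratings = [2.0, 7.5, 4.0, 8.5, 3.0]   # human coders
model_ratings = [2.5, 6.0, 4.5, 9.0, 2.0]   # mLLM output

r = pearson_r(human_ratings, model_ratings)
print(f"Pearson r between human and model arousal ratings: {r:.2f}")
```

In a real evaluation one would also report chance-corrected reliability measures and check whether rating errors correlate with speaker demographics, which is where the paper reports the gap between ideal and real-world recordings.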

Country of Origin
🇦🇹 Austria

Page Count
58 pages

Category
Computer Science:
Computation and Language