Hierarchical MoE: Continuous Multimodal Emotion Recognition with Incomplete and Asynchronous Inputs
By: Yitong Zhu, Lei Han, Guanxuan Jiang and more
Potential Business Impact:
Lets computers understand feelings even when information is missing.
Multimodal emotion recognition (MER) is crucial for human-computer interaction, yet real-world challenges like dynamic modality incompleteness and asynchrony severely limit its robustness. Existing methods often assume consistently complete data or lack dynamic adaptability. To address these limitations, we propose a novel Hi-MoE (Hierarchical Mixture-of-Experts) framework for robust continuous emotion prediction. This framework employs a dual-layer expert structure. A Modality Expert Bank utilizes soft routing to dynamically handle missing modalities and achieve robust information fusion. A subsequent Emotion Expert Bank leverages differential-attention routing to flexibly attend to emotional prototypes, enabling fine-grained emotion representation. Additionally, a cross-modal alignment module explicitly addresses temporal shifts and semantic inconsistencies between modalities. Extensive experiments on the benchmark datasets DEAP and DREAMER demonstrate our model's state-of-the-art performance in continuous emotion regression, showcasing exceptional robustness under challenging conditions such as dynamic modality absence and asynchronous sampling. This research significantly advances the development of intelligent emotion systems adaptable to complex real-world environments.
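The core idea behind the Modality Expert Bank can be illustrated with a minimal sketch of soft routing over a bank of experts. This is not the authors' implementation: the linear experts, the router, and the convention of masking and renormalizing the gates of absent modalities are illustrative assumptions, shown only to make "soft routing with missing modalities" concrete.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical setup: one small linear expert per modality and a shared
# linear router. Shapes and expert forms are assumptions for illustration.
rng = np.random.default_rng(0)
d_in, d_out, n_experts = 8, 4, 3

experts = [rng.standard_normal((d_out, d_in)) for _ in range(n_experts)]
router = rng.standard_normal((n_experts, d_in))

def soft_moe(x, present=None):
    """Soft routing: gate every expert's output with softmax weights.

    `present` is a 0/1 vector marking which modality experts are available;
    gates of missing modalities are zeroed and the rest renormalized, so
    fusion degrades gracefully instead of failing on incomplete inputs.
    """
    gates = softmax(router @ x)
    if present is not None:
        gates = gates * present
        gates = gates / gates.sum()  # renormalize over available experts
    return sum(g * (W @ x) for g, W in zip(gates, experts))

x = rng.standard_normal(d_in)
y_full = soft_moe(x)                                   # all modalities present
y_drop = soft_moe(x, present=np.array([1.0, 0.0, 1.0]))  # one modality missing
```

Because the gates are continuous rather than hard top-1 choices, the fused representation varies smoothly as modalities appear and disappear, which is the robustness property the abstract attributes to soft routing.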
Similar Papers
More Is Better: A MoE-Based Emotion Recognition Framework with Human Preference Alignment
CV and Pattern Recognition
Helps computers understand emotions better from faces.
ECMF: Enhanced Cross-Modal Fusion for Multimodal Emotion Recognition in MER-SEMI Challenge
CV and Pattern Recognition
Helps computers understand your feelings from faces, voices, words.