ADMC: Attention-based Diffusion Model for Missing Modalities Feature Completion
By: Wei Zhang, Juan Chen, Yanbo J. Wang, and more
Potential Business Impact:
Helps computers understand feelings even with missing clues.
Multimodal emotion and intent recognition is essential for automated human-computer interaction: it analyzes users' speech, text, and visual information to predict their emotions or intent. A significant challenge is missing modalities caused by sensor malfunctions or incomplete data. Traditional methods that attempt to reconstruct the missing information often suffer from over-coupling and imprecise generation, leading to suboptimal outcomes. To address these issues, we introduce an Attention-based Diffusion model for Missing Modalities feature Completion (ADMC). Our framework independently trains feature extraction networks for each modality, preserving their unique characteristics and avoiding over-coupling. The Attention-based Diffusion Network (ADN) generates missing-modality features that closely align with the authentic multimodal distribution, improving performance across all missing-modality scenarios. Moreover, ADN's cross-modal generation improves recognition even in full-modality settings. Our approach achieves state-of-the-art results on the IEMOCAP and MIntRec benchmarks, demonstrating its effectiveness in both missing- and complete-modality scenarios.
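The abstract does not include implementation details, so the following is only a minimal sketch of the general idea, assuming a PyTorch-style setup: a denoiser that reconstructs a missing modality's features from noise, conditioned on the observed modalities via cross-attention, sampled with a simplified DDPM-style reverse process. All names, dimensions, and the sampler are illustrative assumptions, not the authors' published code.

```python
import torch
import torch.nn as nn

# Illustrative sketch (not the authors' code): a cross-attention denoiser
# that recovers a missing modality's features conditioned on the
# features of the modalities that are available.

class AttentionDenoiser(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.time_embed = nn.Sequential(nn.Linear(1, dim), nn.SiLU(), nn.Linear(dim, dim))
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, dim * 4), nn.SiLU(), nn.Linear(dim * 4, dim))
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x_t, t, context):
        # x_t: noisy missing-modality features, shape (B, L, D)
        # t: diffusion timestep in [0, 1], shape (B, 1)
        # context: features of the observed modalities, shape (B, L_ctx, D)
        h = x_t + self.time_embed(t).unsqueeze(1)
        attn_out, _ = self.cross_attn(self.norm1(h), context, context)
        h = h + attn_out                      # attend to available modalities
        return h + self.mlp(self.norm2(h))    # predict the added noise


@torch.no_grad()
def complete_missing_modality(model, context, steps=50, shape=(1, 8, 256)):
    """Toy DDPM-style reverse process: start from Gaussian noise and
    denoise step by step into a plausible missing-modality feature."""
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    x = torch.randn(shape)
    for i in reversed(range(steps)):
        t = torch.full((shape[0], 1), i / steps)
        eps = model(x, t, context)            # predicted noise
        coef = betas[i] / torch.sqrt(1.0 - alpha_bar[i])
        x = (x - coef * eps) / torch.sqrt(alphas[i])
        if i > 0:                             # add noise except at the last step
            x = x + torch.sqrt(betas[i]) * torch.randn_like(x)
    return x


if __name__ == "__main__":
    model = AttentionDenoiser()
    text_and_audio = torch.randn(1, 16, 256)  # observed-modality features
    visual_hat = complete_missing_modality(model, text_and_audio)
    print(visual_hat.shape)  # torch.Size([1, 8, 256])
```

In this reading, keeping per-modality feature extractors separate (as the abstract describes) means the denoiser only has to learn the cross-modal mapping in feature space, which is where the claimed robustness to missing modalities would come from.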
Similar Papers
AMM-Diff: Adaptive Multi-Modality Diffusion Network for Missing Modality Imputation
CV and Pattern Recognition
Creates missing medical scans from available ones.
MODA: MOdular Duplex Attention for Multimodal Perception, Cognition, and Emotion Understanding
CV and Pattern Recognition
Helps computers understand pictures and words better.
No Modality Left Behind: Adapting to Missing Modalities via Knowledge Distillation for Brain Tumor Segmentation
CV and Pattern Recognition
Helps doctors find brain tumors even with missing scans.