Senti-iFusion: An Integrity-centered Hierarchical Fusion Framework for Multimodal Sentiment Analysis under Uncertain Modality Missingness
By: Liling Li, Guoyang Xu, Xiongri Shen, and more
Potential Business Impact:
Helps computers understand feelings even with missing info.
Multimodal Sentiment Analysis (MSA) is critical for human-computer interaction but faces challenges when modalities are incomplete or missing. Existing methods often assume pre-defined missing modalities or fixed missing rates, limiting their real-world applicability. To address this challenge, we propose Senti-iFusion, an integrity-centered hierarchical fusion framework capable of handling inter- and intra-modality missingness simultaneously. It comprises three hierarchical components: Integrity Estimation, Integrity-weighted Completion, and Integrity-guided Fusion. First, the Integrity Estimation module predicts the completeness of each modality and mitigates the noise caused by incomplete data. Second, the Integrity-weighted Cross-modal Completion module employs a novel weighting mechanism to disentangle consistent semantic structures from modality-specific representations, enabling precise recovery of sentiment-related features across the language, acoustic, and visual modalities. A dual-depth validation with feature- and semantic-level losses ensures consistent reconstruction at both the fine-grained (low-level) and semantic (high-level) scales. Finally, the Integrity-guided Adaptive Fusion mechanism dynamically selects the dominant modality for attention-based fusion, so that the most reliable modality, judged by its completeness and quality, contributes most to the final prediction. Senti-iFusion employs a progressive training strategy to ensure stable convergence. Experimental results on popular MSA datasets demonstrate that Senti-iFusion outperforms existing methods, particularly on fine-grained sentiment analysis tasks. The code and the proposed Senti-iFusion model will be made publicly available.
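To make the abstract's two central ideas concrete, here is a minimal sketch, assuming a PyTorch setting, of how per-modality integrity estimation and integrity-guided, dominant-modality attention fusion could be wired together. The module and variable names (IntegrityEstimator, IntegrityGuidedFusion, d_model) are illustrative assumptions and do not correspond to the authors' released implementation.

```python
# Hypothetical sketch: (1) predict a completeness score per modality,
# (2) let the highest-integrity modality drive attention-based fusion.
import torch
import torch.nn as nn


class IntegrityEstimator(nn.Module):
    """Predicts a completeness score in [0, 1] for one modality sequence."""

    def __init__(self, d_model: int):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(d_model, d_model // 2),
            nn.ReLU(),
            nn.Linear(d_model // 2, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) -> mean-pool over time, then score
        return self.scorer(x.mean(dim=1)).squeeze(-1)  # (batch,)


class IntegrityGuidedFusion(nn.Module):
    """Uses the highest-integrity modality as the attention query;
    the remaining modalities serve as keys/values."""

    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, feats: list[torch.Tensor], integrity: torch.Tensor):
        # feats: list of (batch, seq_len, d_model); integrity: (batch, n_modalities)
        dominant = integrity.mean(dim=0).argmax().item()  # batch-level choice for simplicity
        query = feats[dominant]
        context = torch.cat([f for i, f in enumerate(feats) if i != dominant], dim=1)
        fused, _ = self.attn(query, context, context)
        # residual connection keeps the dominant modality's own information
        return fused + query


if __name__ == "__main__":
    batch, seq, d = 2, 10, 32
    language, acoustic, visual = (torch.randn(batch, seq, d) for _ in range(3))
    estimator = IntegrityEstimator(d)
    integrity = torch.stack([estimator(m) for m in (language, acoustic, visual)], dim=1)
    fusion = IntegrityGuidedFusion(d)
    print(fusion([language, acoustic, visual], integrity).shape)  # torch.Size([2, 10, 32])
```

In this sketch the dominant modality is chosen per batch for brevity; the paper's mechanism is described as dynamic and could instead select per sample, and the integrity scores could also reweight the completion step, which is not shown here.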
Similar Papers
PSA-MF: Personality-Sentiment Aligned Multi-Level Fusion for Multimodal Sentiment Analysis
Multimedia
Helps computers understand feelings from faces, voices, words.
FINE: Factorized multimodal sentiment analysis via mutual INformation Estimation
Multimedia
Helps computers understand feelings from text and pictures.
Rethinking Multimodal Sentiment Analysis: A High-Accuracy, Simplified Fusion Architecture
Computation and Language
Helps computers understand feelings from talking, seeing, and hearing.