Graph-based Interaction Augmentation Network for Robust Multimodal Sentiment Analysis
By: Hu Zhangfeng, Shi Mengxin
Potential Business Impact:
Helps computers understand feelings from messy videos.
The inevitable modality imperfection in real-world scenarios poses significant challenges for Multimodal Sentiment Analysis (MSA). While existing methods tailor reconstruction or joint representation learning strategies to restore missing semantics, they often overlook complex dependencies within and across modalities. Consequently, they fail to fully leverage the available modalities to capture complementary semantics. To this end, this paper proposes a novel graph-based framework that exploits both intra- and inter-modality interactions, enabling imperfect samples to derive missing semantics from complementary parts for robust MSA. Specifically, we first devise a learnable hypergraph that models intra-modality temporal dependencies, exploiting contextual information within each modality. Then, a directed graph built on an attention mechanism is employed to explore inter-modality correlations, capturing complementary information across different modalities. Finally, knowledge from perfect samples is integrated to supervise the interaction processes, guiding the model toward reliable and robust joint representations. Extensive experiments on the MOSI and MOSEI datasets demonstrate the effectiveness of our method.
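The abstract describes three components: a learnable hypergraph over each modality's time steps, an attention-based directed graph across modalities, and supervision from perfect (complete) samples. Below is a minimal PyTorch sketch of how such a pipeline could be wired together. All names (HypergraphLayer, CrossModalGraphAttention, distillation_loss) and the soft-incidence formulation are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of the three steps named in the abstract; names are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HypergraphLayer(nn.Module):
    """Learnable hypergraph over the T time steps of one modality.
    Each of K hyperedges softly groups time steps, modeling
    intra-modality temporal dependencies."""
    def __init__(self, dim, num_edges=8):
        super().__init__()
        self.edge_proj = nn.Linear(dim, num_edges)  # soft incidence H in R^{T x K}
        self.update = nn.Linear(dim, dim)

    def forward(self, x):                            # x: (B, T, dim)
        H = torch.softmax(self.edge_proj(x), dim=1)  # (B, T, K), normalized over time
        edge_feat = H.transpose(1, 2) @ x            # (B, K, dim): nodes -> hyperedges
        node_feat = H @ edge_feat                    # (B, T, dim): hyperedges -> nodes
        return x + F.relu(self.update(node_feat))    # residual contextual update

class CrossModalGraphAttention(nn.Module):
    """Directed graph over modalities: the query modality attends to the
    others, pulling in complementary semantics (e.g., to patch a noisy stream)."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, query_mod, other_mods):        # query_mod: (B, T, dim)
        context = torch.cat(other_mods, dim=1)       # concat the source modalities
        out, _ = self.attn(query_mod, context, context)  # directed edges: others -> query
        return query_mod + out

def distillation_loss(student_repr, teacher_repr):
    """One plausible form of the 'perfect sample' supervision: align the
    imperfect-sample representation with that of a teacher fed the complete sample."""
    return F.mse_loss(student_repr, teacher_repr.detach())

# Example: augment a (possibly corrupted) text stream with audio/video context.
B, T, D = 2, 20, 128
text, audio, video = (torch.randn(B, T, D) for _ in range(3))
text = HypergraphLayer(D)(text)                            # intra-modality step
text = CrossModalGraphAttention(D)(text, [audio, video])   # inter-modality step
```

The soft incidence matrix makes the hypergraph differentiable end to end, and the residual connections keep the original modality signal available when a stream happens to be clean; both are design guesses consistent with, but not specified by, the abstract.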
Similar Papers
Unsupervised Multimodal Graph-based Model for Geo-social Analysis
Social and Information Networks
Finds important news in social media posts.
Structures Meet Semantics: Multimodal Fusion via Graph Contrastive Learning
CV and Pattern Recognition
Helps computers understand feelings from voice, face, and words.
Disentangling Bias by Modeling Intra- and Inter-modal Causal Attention for Multimodal Sentiment Analysis
Machine Learning (CS)
Helps computers understand feelings better, not just tricks.