FINE: Factorized multimodal sentiment analysis via mutual INformation Estimation
By: Yadong Liu, Shangfei Wang
Potential Business Impact:
Helps computers understand feelings from text and pictures.
Multimodal sentiment analysis remains a challenging task due to the inherent heterogeneity across modalities. Such heterogeneity often manifests as asynchronous signals, imbalanced information between modalities, and interference from task-irrelevant noise, hindering the learning of robust and accurate sentiment representations. To address these issues, we propose a factorized multimodal fusion framework that first disentangles each modality into shared and unique representations, and then suppresses task-irrelevant noise within both to retain only sentiment-critical representations. This fine-grained decomposition improves representation quality by reducing redundancy, promoting cross-modal complementarity, and isolating task-relevant sentiment cues. Rather than manipulating the feature space directly, we adopt a mutual information-based optimization strategy to guide the factorization process in a more stable and principled manner. To further support feature extraction and long-term temporal modeling, we introduce two auxiliary modules: a Mixture of Q-Formers, placed before factorization, which uses learnable queries to extract fine-grained affective features from multiple modalities, and a Dynamic Contrastive Queue, placed after factorization, which stores the latest high-level representations for contrastive learning, enabling the model to capture long-range discriminative patterns and improve class-level separability. Extensive experiments on multiple public datasets demonstrate that our method consistently outperforms existing approaches, validating the effectiveness and robustness of the proposed framework.
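To make the shared/unique factorization and the mutual information objective concrete, below is a minimal PyTorch-style sketch. It is an illustration under stated assumptions, not the paper's implementation: all names (ModalityFactorizer, infonce_lower_bound) are hypothetical, and an InfoNCE lower bound stands in for whatever MI estimator the paper actually uses.

```python
# A minimal sketch of factorizing each modality into shared/unique parts and
# guiding the factorization with a mutual-information objective. Hypothetical
# illustration only; module names and the InfoNCE bound are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityFactorizer(nn.Module):
    """Disentangles one modality's features into shared and unique parts."""
    def __init__(self, dim: int):
        super().__init__()
        self.shared = nn.Linear(dim, dim)  # projection onto the cross-modal shared space
        self.unique = nn.Linear(dim, dim)  # projection onto the modality-unique space

    def forward(self, x: torch.Tensor):
        return self.shared(x), self.unique(x)

def infonce_lower_bound(a: torch.Tensor, b: torch.Tensor, tau: float = 0.1):
    """InfoNCE lower bound on the mutual information between paired batches a, b.

    Maximizing it pulls representations of the same sample together across
    modalities while pushing apart mismatched pairs in the batch.
    """
    a = F.normalize(a, dim=-1)
    b = F.normalize(b, dim=-1)
    logits = a @ b.t() / tau                      # (B, B) similarity matrix
    targets = torch.arange(a.size(0), device=a.device)
    return -F.cross_entropy(logits, targets)      # higher value = tighter MI bound

# Toy usage with two modalities (e.g., text and vision), batch of 8, dim 64.
B, D = 8, 64
f_text, f_vis = ModalityFactorizer(D), ModalityFactorizer(D)
s_t, u_t = f_text(torch.randn(B, D))
s_v, u_v = f_vis(torch.randn(B, D))

# Maximize MI between the shared parts across modalities (cross-modal alignment);
# discourage overlap between shared and unique parts of the same modality. The
# cosine penalty here is a crude stand-in for a principled MI-based term.
loss_shared = -infonce_lower_bound(s_t, s_v)
loss_disentangle = F.cosine_similarity(s_t, u_t).abs().mean() + \
                   F.cosine_similarity(s_v, u_v).abs().mean()
loss = loss_shared + 0.1 * loss_disentangle
loss.backward()
```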
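The Dynamic Contrastive Queue can likewise be pictured as a MoCo-style FIFO buffer of recent post-factorization representations, used as negatives and same-class positives for a contrastive loss. The sketch below is again a hypothetical reading of the abstract (queue size, labels, and the supervised contrastive form are all assumptions), not the authors' code.

```python
# A minimal sketch of a dynamic queue storing recent high-level representations
# with sentiment labels for contrastive learning. Hypothetical illustration.
import torch
import torch.nn.functional as F

class DynamicQueue:
    """Fixed-size FIFO queue of L2-normalized representations and labels."""
    def __init__(self, dim: int, size: int = 1024):
        self.feats = torch.zeros(size, dim)
        self.labels = torch.full((size,), -1, dtype=torch.long)  # -1 marks empty slots
        self.ptr, self.size = 0, size

    @torch.no_grad()
    def enqueue(self, feats: torch.Tensor, labels: torch.Tensor):
        """Overwrite the oldest entries with the newest batch."""
        n = feats.size(0)
        idx = (self.ptr + torch.arange(n)) % self.size
        self.feats[idx] = F.normalize(feats, dim=-1)
        self.labels[idx] = labels
        self.ptr = (self.ptr + n) % self.size

    def supervised_contrastive_loss(self, feats, labels, tau: float = 0.1):
        """Pull same-class queue entries toward each anchor, push others away."""
        valid = self.labels >= 0
        q_feats, q_labels = self.feats[valid], self.labels[valid]
        anchors = F.normalize(feats, dim=-1)
        logits = anchors @ q_feats.t() / tau                  # (B, Q) similarities
        pos = (labels[:, None] == q_labels[None, :]).float()  # same-class mask
        log_prob = logits - logits.logsumexp(dim=1, keepdim=True)
        # Mean log-likelihood of positives per anchor; anchors with no
        # positives in the queue contribute zero.
        denom = pos.sum(dim=1).clamp(min=1)
        return -(pos * log_prob).sum(dim=1).div(denom).mean()

# Toy usage: enqueue one batch, then score the next batch against the queue.
queue = DynamicQueue(dim=64, size=256)
queue.enqueue(torch.randn(32, 64), torch.randint(0, 3, (32,)))
loss = queue.supervised_contrastive_loss(torch.randn(8, 64), torch.randint(0, 3, (8,)))
```

Because the queue outlives any single mini-batch, the contrastive loss sees representations from many past batches, which is one plausible reading of how the module captures long-range discriminative patterns and improves class-level separability.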
Similar Papers
Robust Multimodal Sentiment Analysis with Distribution-Based Feature Recovery and Fusion
Computation and Language
Helps computers understand feelings from broken pictures and words.
Multi-Modal Opinion Integration for Financial Sentiment Analysis using Cross-Modal Attention
Machine Learning (CS)
Helps predict stock prices by understanding opinions.
Senti-iFusion: An Integrity-centered Hierarchical Fusion Framework for Multimodal Sentiment Analysis under Uncertain Modality Missingness
Human-Computer Interaction
Helps computers understand feelings even with missing info.