DyFuLM: An Advanced Multimodal Framework for Sentiment Analysis
By: Ruohan Zhou, Jiachen Yuan, Churui Yang, and more
Potential Business Impact:
Helps computers understand feelings in writing better.
Understanding sentiment in complex textual expressions remains a fundamental challenge in affective computing. To address this, we propose the Dynamic Fusion Learning Model (DyFuLM), a multimodal framework designed to capture both hierarchical semantic representations and fine-grained emotional nuances. DyFuLM introduces two key modules: a Hierarchical Dynamic Fusion module that adaptively integrates multi-level features, and a Gated Feature Aggregation module that regulates cross-layer information flow to achieve balanced representation learning. Comprehensive experiments on multi-task sentiment datasets demonstrate that DyFuLM achieves 82.64% coarse-grained and 68.48% fine-grained accuracy, yielding the lowest regression errors (MAE = 0.0674, MSE = 0.0082) and the highest coefficient of determination (R^2 = 0.6903). Furthermore, an ablation study validates the effectiveness of each module in DyFuLM. When all modules are removed, accuracy drops by 0.91% on the coarse-grained task and 0.68% on the fine-grained task. Keeping only the gated fusion module causes decreases of 0.75% and 0.55%, while removing the dynamic loss mechanism results in drops of 0.78% and 0.26% for coarse-grained and fine-grained sentiment classification, respectively. These results demonstrate that each module contributes significantly to feature interaction and task balance. Overall, the experimental findings further confirm that DyFuLM enhances sentiment representation and overall performance through effective hierarchical feature fusion.
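The abstract does not spell out how the Gated Feature Aggregation module is implemented, so the following is only an illustrative sketch of what gated cross-layer aggregation of hierarchical features typically looks like in PyTorch. The class name GatedFeatureAggregation, the per-layer sigmoid gates, and all dimensions are assumptions for illustration, not the authors' actual architecture.

```python
# Illustrative sketch (assumed, not from the paper): each encoder layer's
# features are modulated by a learned sigmoid gate before being averaged,
# so the model can regulate how much each hierarchical level contributes.
import torch
import torch.nn as nn


class GatedFeatureAggregation(nn.Module):
    """Fuses a list of layer features, each of shape [batch, seq_len, hidden_dim]."""

    def __init__(self, hidden_dim: int, num_layers: int):
        super().__init__()
        # One gate network per layer; sizes are illustrative only.
        self.gates = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.Sigmoid())
            for _ in range(num_layers)
        )

    def forward(self, layer_feats: list[torch.Tensor]) -> torch.Tensor:
        # Element-wise gate each layer's features, then average across layers.
        gated = [gate(feat) * feat for gate, feat in zip(self.gates, layer_feats)]
        return torch.stack(gated, dim=0).mean(dim=0)


# Usage: fuse the last three hidden states of a hypothetical text encoder.
feats = [torch.randn(2, 16, 768) for _ in range(3)]
fused = GatedFeatureAggregation(hidden_dim=768, num_layers=3)(feats)
print(fused.shape)  # torch.Size([2, 16, 768])
```

In this sketch the gates play the role the abstract assigns to the module, regulating cross-layer information flow so that no single level dominates the fused representation; the paper's actual fusion and dynamic loss weighting may differ.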
Similar Papers
FINE: Factorized multimodal sentiment analysis via mutual INformation Estimation
Multimedia
Helps computers understand feelings from text and pictures.
Rethinking Multimodal Sentiment Analysis: A High-Accuracy, Simplified Fusion Architecture
Computation and Language
Helps computers understand feelings from talking, seeing, and hearing.
Robust Multimodal Sentiment Analysis with Distribution-Based Feature Recovery and Fusion
Computation and Language
Helps computers understand feelings from broken pictures and words.