Cross-Modal Prototype Augmentation and Dual-Grained Prompt Learning for Social Media Popularity Prediction
By: Ao Zhou, Mingsheng Tu, Luping Wang and more
Potential Business Impact:
Predicts social media post popularity better.
Social Media Popularity Prediction is a complex multimodal task that requires effective integration of images, text, and structured information. However, current approaches suffer from inadequate visual-textual alignment and fail to capture the inherent cross-content correlations and hierarchical patterns in social media data. To overcome these limitations, we establish a multi-class framework, introducing hierarchical prototypes for structural enhancement and contrastive learning for improved vision-text alignment. Furthermore, we propose a feature-enhanced framework integrating dual-grained prompt learning and cross-modal attention mechanisms, achieving precise multimodal representation through fine-grained category modeling. Experimental results demonstrate state-of-the-art performance on benchmark metrics, establishing new reference standards for multimodal social media analysis.
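The abstract names contrastive learning as the mechanism for vision-text alignment but does not give the loss. As a rough illustration only, the sketch below implements a generic symmetric InfoNCE objective of the kind commonly used for image-text alignment; it is not confirmed to be the authors' formulation. The function names, the temperature of 0.07, and the toy two-dimensional embeddings are all assumptions made for this example.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax_at(logits, index):
    """Numerically stable log-softmax value of logits[index]."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return logits[index] - lse

def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Hypothetical helper: symmetric InfoNCE over a batch of paired embeddings.

    image_embs[k] and text_embs[k] are assumed to describe the same post.
    The temperature value is an assumption, not taken from the paper.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # sim[i][j]: temperature-scaled cosine similarity of image i and text j.
    sim = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
           for i in imgs]
    # Image-to-text: each row should put its mass on the diagonal entry.
    i2t = -sum(log_softmax_at(sim[k], k) for k in range(n)) / n
    # Text-to-image: the same check along each column.
    t2i = -sum(log_softmax_at([sim[r][k] for r in range(n)], k)
               for k in range(n)) / n
    return 0.5 * (i2t + t2i)

# Toy batch: matched pairs versus deliberately swapped pairs.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned = symmetric_contrastive_loss(images, [[1.0, 0.0], [0.0, 1.0]])
swapped = symmetric_contrastive_loss(images, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the matched pairs yield a loss close to zero while the swapped pairs yield a much larger one, which is the behaviour an alignment objective of this kind rewards.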
Similar Papers
Causal Inspired Multi Modal Recommendation
Information Retrieval
Fixes online shopping picks by ignoring fake trends.
Knowledge graph-based personalized multimodal recommendation fusion framework
Information Retrieval
Helps computers understand what you like better.
Towards Multimodal Sentiment Analysis via Contrastive Cross-modal Retrieval Augmentation and Hierachical Prompts
Multimedia
Helps computers understand feelings from text and pictures.