MVP: Winning Solution to SMP Challenge 2025 Video Track
By: Liliang Ye, Yunyao Zhang, Yafeng Wu and more
Potential Business Impact:
Predicts which videos will be popular online.
Social media platforms serve as central hubs for content dissemination, opinion expression, and public engagement across diverse modalities. Accurately predicting the popularity of social media videos enables valuable applications in content recommendation, trend detection, and audience engagement. In this paper, we present Multimodal Video Predictor (MVP), our winning solution to the Video Track of the SMP Challenge 2025. MVP constructs expressive post representations by integrating deep video features extracted from pretrained models with user metadata and contextual information. The framework applies systematic preprocessing techniques, including log-transformations and outlier removal, to improve model robustness. A gradient-boosted regression model is trained to capture complex patterns across modalities. Our approach ranked first in the official evaluation of the Video Track, demonstrating its effectiveness and reliability for multimodal video popularity prediction on social platforms. The source code is available at https://anonymous.4open.science/r/SMPDVideo.
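The preprocessing named in the abstract (log-transformations and outlier removal) can be illustrated with a minimal sketch. The abstract does not say which features are transformed or which outlier rule is used, so everything here is an assumption for illustration: the field names `followers` and `views`, the use of log(1 + x), and the 1.5 × IQR fence are hypothetical choices, not the authors' method.

```python
import math
import statistics


def log_transform(values):
    """Compress heavy-tailed, non-negative counts with log(1 + x)."""
    return [math.log1p(v) for v in values]


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences placed k * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def remove_outliers(rows, key, k=1.5):
    """Keep only rows whose `key` value lies inside the IQR fences."""
    low, high = iqr_bounds([r[key] for r in rows], k)
    return [r for r in rows if low <= r[key] <= high]


# Hypothetical posts; field names and values are made up for illustration.
posts = [
    {"followers": 120, "views": 900},
    {"followers": 300, "views": 1100},
    {"followers": 250, "views": 1000},
    {"followers": 400, "views": 1200},
    {"followers": 180, "views": 950},
    {"followers": 220, "views": 1050},
    {"followers": 90, "views": 5_000_000},  # extreme value
]

# Log-transform the skewed count features (assumed choice of features).
for post in posts:
    post["log_followers"] = math.log1p(post["followers"])
    post["log_views"] = math.log1p(post["views"])

# Drop posts whose log-views fall outside the assumed IQR fences.
clean = remove_outliers(posts, "log_views")
print(len(posts), "posts before filtering,", len(clean), "after")
```

In a pipeline like the one the abstract describes, rows cleaned this way would be joined with the pretrained video features and passed to the gradient-boosted regressor. Filtering on the log scale rather than on raw counts is one way to keep ordinary variation in view counts from being flagged while still catching extreme values.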
Similar Papers
MVP: Enhancing Video Large Language Models via Self-supervised Masked Video Prediction
CV and Pattern Recognition
Teaches computers to understand video time.
MVP: Multimodal Emotion Recognition based on Video and Physiological Signals
CV and Pattern Recognition
Helps computers understand feelings from faces and body signals.
Multi-modal and Metadata Capture Model for Micro Video Popularity Prediction
Multimedia
Predicts which short videos will be popular.