1$^{st}$ Place Solution of WWW 2025 EReL@MIR Workshop Multimodal CTR Prediction Challenge
By: Junwei Xu, Zehao Zhao, Xiaoyu Hu, and more
Potential Business Impact: Helps websites show you things you'll like.
The WWW 2025 EReL@MIR Workshop Multimodal CTR Prediction Challenge focuses on effectively applying multimodal embedding features to improve click-through rate (CTR) prediction in recommender systems. This technical report presents our 1$^{st}$-place solution for Task 2, which combines sequential modeling and feature interaction learning to capture user-item interactions. For multimodal information integration, we simply append the frozen multimodal embeddings to each item embedding. Experiments on the challenge dataset demonstrate the effectiveness of our method, which achieves 0.9839 AUC on the leaderboard, substantially outperforming the baseline model. Code and configuration are available in our GitHub repository, and the model checkpoint can be found on HuggingFace.
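The integration step described above (appending frozen multimodal embeddings to each item embedding before the downstream layers) can be illustrated with a minimal PyTorch sketch. This is not the authors' released code: the class name, layer sizes, and the toy MLP head are hypothetical, and the real solution feeds the concatenated representation into sequential-modeling and feature-interaction modules rather than a plain MLP.

```python
import torch
import torch.nn as nn


class SimpleMultimodalCTR(nn.Module):
    """Minimal sketch: trainable item ID embedding + frozen multimodal
    embedding, concatenated and fed to a toy CTR head.

    `mm_table` is a precomputed (num_items, mm_dim) tensor of multimodal
    embeddings; freezing it mirrors the report's "simply append the frozen
    multimodal embeddings" strategy. All dimensions here are hypothetical.
    """

    def __init__(self, mm_table: torch.Tensor, id_dim: int = 64, hidden: int = 128):
        super().__init__()
        num_items, mm_dim = mm_table.shape
        # Trainable item ID embedding.
        self.id_emb = nn.Embedding(num_items, id_dim)
        # Frozen lookup table holding the precomputed multimodal embeddings.
        self.mm_emb = nn.Embedding.from_pretrained(mm_table, freeze=True)
        # Toy stand-in for the solution's interaction / prediction layers.
        self.head = nn.Sequential(
            nn.Linear(id_dim + mm_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, item_ids: torch.Tensor) -> torch.Tensor:
        # Append (concatenate) the frozen multimodal vector to each item embedding.
        x = torch.cat([self.id_emb(item_ids), self.mm_emb(item_ids)], dim=-1)
        return torch.sigmoid(self.head(x)).squeeze(-1)  # click probability per item


# Toy usage: 1000 items with 32-dim precomputed multimodal embeddings.
mm_table = torch.randn(1000, 32)
model = SimpleMultimodalCTR(mm_table)
probs = model(torch.tensor([3, 17, 256]))
print(probs.shape)  # torch.Size([3])
```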
Similar Papers
Feature Fusion Revisited: Multimodal CTR Prediction for MMCTR Challenge (Information Retrieval). Makes online ads show up faster and better.
Quadratic Interest Network for Multimodal Click-Through Rate Prediction (Information Retrieval). Helps websites show you things you'll like.
MIM: Multi-modal Content Interest Modeling Paradigm for User Behavior Modeling (Information Retrieval). Shows ads people actually want to click.