MARS: Modality-Aligned Retrieval for Sequence Augmented CTR Prediction
By: Yutian Xiao, Shukuan Wang, Binhao Wang and more
Potential Business Impact:
Helps recommend things to less active users.
Click-through rate (CTR) prediction serves as a cornerstone of recommender systems. Despite the strong performance of current CTR models based on user behavior modeling, they are still severely limited by interaction sparsity, especially in low-active user scenarios. To address this issue, data augmentation of user behavior is a promising research direction. However, existing data augmentation methods heavily rely on collaborative signals while overlooking the rich multimodal features of items, leading to insufficient modeling of low-active users. To alleviate this problem, we propose a novel framework MARS (Modality-Aligned Retrieval for Sequence Augmented CTR Prediction). MARS utilizes a Stein kernel-based approach to align text and image features into a unified and unbiased semantic space to construct multimodal user embeddings. Subsequently, each low-active user's behavior sequence is augmented by retrieving, filtering, and concentrating the most similar behavior sequence of high-active users via multimodal user embeddings. Validated by extensive offline experiments and online A/B tests, our framework MARS consistently outperforms state-of-the-art baselines and achieves substantial growth on core business metrics within Kuaishou (https://www.kuaishou.com/). Consequently, MARS has been successfully deployed, serving the main traffic for hundreds of millions of users. To ensure reproducibility, we provide anonymous access to the implementation code (https://github.com/wangshukuan/MARS).
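The retrieve, filter, and merge step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the multimodal user embeddings have already been computed and aligned, swaps in plain cosine similarity for retrieval, and uses hypothetical names and thresholds (`augment_sequence`, `min_sim`, `max_extra`) chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def augment_sequence(target_emb, target_seq, donors, min_sim=0.8, max_extra=3):
    """Extend a low-active user's sequence with items from a similar high-active user.

    donors: list of (embedding, behavior_sequence) pairs for high-active users.
    min_sim and max_extra are illustrative assumptions, not values from the paper.
    """
    # Retrieve: pick the donor whose embedding is closest to the target's.
    best_sim, best_seq = max(
        ((cosine(target_emb, emb), seq) for emb, seq in donors),
        key=lambda pair: pair[0],
        default=(0.0, []),
    )
    # Filter: reject weak matches and drop items the target already has.
    if best_sim < min_sim:
        return list(target_seq)
    seen = set(target_seq)
    extra = [item for item in best_seq if item not in seen][:max_extra]
    # Merge: retrieved items first, the user's own behavior kept last.
    return extra + list(target_seq)


donors = [
    ([1.0, 0.0], ["a", "b", "c"]),
    ([0.0, 1.0], ["x", "y", "z"]),
]
augmented = augment_sequence([0.9, 0.1], ["b"], donors)
```

Placing the retrieved items before the user's own history is one possible design choice; it keeps the user's real interactions in the most recent positions, which sequence models often weight most heavily.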
Similar Papers
SaviorRec: Semantic-Behavior Alignment for Cold-Start Recommendation
Information Retrieval
Helps online stores show you better stuff.
Quadratic Interest Network for Multimodal Click-Through Rate Prediction
Information Retrieval
Helps websites show you things you'll like.
1st Place Solution of WWW 2025 EReL@MIR Workshop Multimodal CTR Prediction Challenge
Information Retrieval
Helps websites show you things you'll like.