Semantic Item Graph Enhancement for Multimodal Recommendation
By: Xiaoxiong Zhang , Xin Zhou , Zhiwei Zeng and more
Potential Business Impact:
Helps online stores show you better stuff.
Multimodal recommendation systems have attracted increasing attention for their improved performance by leveraging items' multimodal information. Prior methods often build modality-specific item-item semantic graphs from raw modality features and use them as supplementary structures alongside the user-item interaction graph to enhance user preference learning. However, these semantic graphs suffer from semantic deficiencies, including (1) insufficient modeling of collaborative signals among items and (2) structural distortions introduced by noise in raw modality features, ultimately compromising performance. To address these issues, we first extract collaborative signals from the interaction graph and infuse them into each modality-specific item semantic graph to enhance semantic modeling. Then, we design a modulus-based personalized embedding perturbation mechanism that injects perturbations with modulus-guided personalized intensity into embeddings to generate contrastive views. This enables the model to learn noise-robust representations through contrastive learning, thereby reducing the effect of structural noise in semantic graphs. Besides, we propose a dual representation alignment mechanism that first aligns multiple semantic representations via a designed Anchor-based InfoNCE loss using behavior representations as anchors, and then aligns behavior representations with the fused semantics by standard InfoNCE, to ensure representation consistency. Extensive experiments on four benchmark datasets validate the effectiveness of our framework.
Similar Papers
Causal Inspired Multi Modal Recommendation
Information Retrieval
Fixes online shopping picks by ignoring fake trends.
Knowledge graph-based personalized multimodal recommendation fusion framework
Information Retrieval
Helps computers understand what you like better.
MLLMRec: Exploring the Potential of Multimodal Large Language Models in Recommender Systems
Information Retrieval
Suggests better movies and products you'll like.