DLRREC: Denoising Latent Representations via Multi-Modal Knowledge Fusion in Deep Recommender Systems
By: Jiahao Tian, Zhenkai Wang
Potential Business Impact:
Makes movie suggestions more accurate by filtering noise out of LLM-generated features.
Modern recommender systems struggle to effectively utilize the rich but high-dimensional and noisy multi-modal features generated by Large Language Models (LLMs). Treating these features as static inputs decouples them from the core recommendation task. We address this limitation with a novel framework built on a key insight: deeply fusing multi-modal and collaborative knowledge for representation denoising. Our unified architecture introduces two primary technical innovations. First, we integrate dimensionality reduction directly into the recommendation model, enabling end-to-end co-training that makes the reduction process aware of the final ranking objective. Second, we introduce a contrastive learning objective that explicitly incorporates the collaborative filtering signal into the latent space. This synergistic process refines raw LLM embeddings, filtering out noise while amplifying task-relevant signals. Extensive experiments confirm our method's superior discriminative power, demonstrating that this integrated fusion-and-denoising strategy is critical for achieving state-of-the-art performance. Our work provides a foundational paradigm for effectively harnessing LLMs in recommender systems.
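The abstract does not give implementation details, but the two innovations can be illustrated with a minimal PyTorch sketch. Everything below is a hypothetical illustration, not the authors' actual architecture: the DenoisingProjection module, the InfoNCE-style contrastive_loss, bpr_loss, the 0.1 loss weight, and all tensor names are assumptions. The sketch captures the two stated ideas: a learned projection stands in for offline dimensionality reduction so ranking gradients can shape it, and an in-batch contrastive term pulls each item's projected LLM embedding toward its collaborative-filtering (CF) embedding.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DenoisingProjection(nn.Module):
    """Hypothetical sketch: learnable dimensionality reduction co-trained with
    the ranking loss, plus contrastive alignment to CF item embeddings."""

    def __init__(self, llm_dim: int, latent_dim: int, temperature: float = 0.07):
        super().__init__()
        # A learned projection stands in for offline reduction (e.g., PCA),
        # so gradients from the ranking objective shape the reduced space.
        self.proj = nn.Sequential(
            nn.Linear(llm_dim, latent_dim),
            nn.ReLU(),
            nn.Linear(latent_dim, latent_dim),
        )
        self.temperature = temperature

    def forward(self, llm_emb: torch.Tensor) -> torch.Tensor:
        return self.proj(llm_emb)

    def contrastive_loss(self, z_llm: torch.Tensor, z_cf: torch.Tensor) -> torch.Tensor:
        # InfoNCE: an item's projected LLM embedding should match its own CF
        # embedding (positive) rather than other items' (in-batch negatives).
        z_llm = F.normalize(z_llm, dim=-1)
        z_cf = F.normalize(z_cf, dim=-1)
        logits = z_llm @ z_cf.t() / self.temperature  # (B, B) similarities
        targets = torch.arange(z_llm.size(0), device=z_llm.device)
        return F.cross_entropy(logits, targets)


def bpr_loss(pos_scores: torch.Tensor, neg_scores: torch.Tensor) -> torch.Tensor:
    # Standard Bayesian Personalized Ranking loss for the recommendation task.
    return -F.logsigmoid(pos_scores - neg_scores).mean()


# Toy joint step: B users, each with one positive and one negative item.
B, llm_dim, latent_dim = 32, 4096, 64
model = DenoisingProjection(llm_dim, latent_dim)
user_emb = torch.randn(B, latent_dim)
llm_pos, llm_neg = torch.randn(B, llm_dim), torch.randn(B, llm_dim)
cf_pos = torch.randn(B, latent_dim)  # pretrained CF embeddings for positives

z_pos, z_neg = model(llm_pos), model(llm_neg)
ranking = bpr_loss((user_emb * z_pos).sum(-1), (user_emb * z_neg).sum(-1))
loss = ranking + 0.1 * model.contrastive_loss(z_pos, cf_pos)  # joint objective
loss.backward()  # ranking gradients reach the projection: reduction is task-aware
```

In practice the CF embeddings would come from a pretrained collaborative model (e.g., matrix factorization); the key property the sketch captures is that loss.backward() sends ranking gradients through the projection, which is what end-to-end co-training of the dimensionality reduction implies.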
Similar Papers
MLLMRec: Exploring the Potential of Multimodal Large Language Models in Recommender Systems
Information Retrieval
Suggests better movies and products you'll like.
Multimodal Large Language Models with Adaptive Preference Optimization for Sequential Recommendation
Information Retrieval
Helps computers better predict what you'll like.
Learning Item Representations Directly from Multimodal Features for Effective Recommendation
Information Retrieval
Shows you better stuff you might like.