A Remarkably Efficient Paradigm to Multimodal Large Language Models for Sequential Recommendation
By: Qiyong Zhong, Jiajie Su, Ming Yang, and more
Potential Business Impact:
Makes online shopping suggestions much faster.
In this paper, we propose Speeder, a remarkably efficient paradigm for adapting multimodal large language models (MLLMs) to sequential recommendation (SR). Speeder introduces three key components: (1) Multimodal Representation Compression (MRC), which efficiently reduces redundancy in item descriptions; (2) Sequential Position Awareness Enhancement (SPAE), which strengthens the model's ability to capture complex sequential dependencies; and (3) Modality-aware Progressive Optimization (MPO), which progressively integrates different modalities to improve the model's understanding and reduce cognitive biases. Extensive experiments show that Speeder outperforms baselines in both VHR@1 and computational efficiency: it trains 2.5x faster and performs inference 4x faster than state-of-the-art MLLM-based SR models. Future work could focus on incorporating real-time feedback from real-world systems.
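The summary does not give implementation details for MRC, but the core idea, shrinking each item's multimodal token sequence before it enters the LLM so prompts get shorter and cheaper, can be sketched minimally. The function name and the chunked average-pooling strategy below are assumptions for illustration, not the paper's actual method:

```python
import numpy as np

def compress_item_tokens(token_embeddings: np.ndarray, n_out: int) -> np.ndarray:
    """Illustrative compression: reduce an item's multimodal token
    embeddings (shape: n_tokens x dim) to n_out summary tokens by
    average-pooling contiguous chunks. This is a hypothetical sketch,
    not the MRC module from the paper."""
    chunks = np.array_split(token_embeddings, n_out, axis=0)
    return np.stack([chunk.mean(axis=0) for chunk in chunks])

# An item described by 12 text/image tokens of dimension 4,
# compressed to 3 summary tokens -> a 4x shorter item representation.
tokens = np.arange(48, dtype=float).reshape(12, 4)
compressed = compress_item_tokens(tokens, 3)
print(compressed.shape)  # (3, 4)
```

Shorter per-item sequences multiply across the user's whole interaction history, which is one plausible source of the training and inference speedups reported above.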
Similar Papers
Empowering Large Language Model for Sequential Recommendation via Multimodal Embeddings and Semantic IDs
Information Retrieval
Helps online stores show you better stuff.
LEMUR: Large scale End-to-end MUltimodal Recommendation
Information Retrieval
Helps apps show you better stuff you'll like.