Cognitive-Inspired Hierarchical Attention Fusion With Visual and Textual for Cross-Domain Sequential Recommendation
By: Wangyu Wu, Zhenhong Chen, Siqi Song and more
Potential Business Impact:
Recommends products you'll like from other stores.
Cross-Domain Sequential Recommendation (CDSR) predicts user behavior by leveraging historical interactions across multiple domains, modeling cross-domain preferences through intra- and inter-sequence item relationships. Inspired by human cognitive processes, we propose Hierarchical Attention Fusion of Visual and Textual Representations (HAF-VT), a novel approach that integrates visual and textual data to enhance cognitive modeling. Using a frozen CLIP model, we generate image and text embeddings that enrich item representations with multimodal information. A hierarchical attention mechanism then jointly learns single-domain and cross-domain preferences, mimicking how humans integrate information. Evaluated on four e-commerce datasets, HAF-VT outperforms existing methods in capturing cross-domain user interests, bridging cognitive principles with computational models and highlighting the role of multimodal data in sequential decision-making.
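The sketch below illustrates the two ideas stated in the abstract: enriching item representations with frozen CLIP image and text embeddings, and fusing single-domain and cross-domain sequences with a hierarchical attention mechanism. It is a minimal illustration, not the authors' implementation; the module names, dimensions, the use of precomputed CLIP features, and the choice of nn.MultiheadAttention are assumptions made here for clarity.

```python
# Minimal sketch (assumed architecture, not the paper's code) of HAF-VT's two ideas:
# (1) item representations enriched with frozen CLIP image/text embeddings, and
# (2) hierarchical attention over single-domain and merged cross-domain sequences.
import torch
import torch.nn as nn

class HierarchicalFusionSketch(nn.Module):
    def __init__(self, num_items: int, id_dim: int = 64, clip_dim: int = 512, d_model: int = 128):
        super().__init__()
        self.id_emb = nn.Embedding(num_items, id_dim)
        # CLIP features are assumed precomputed per item with a frozen encoder
        # (e.g. CLIPModel.get_image_features / get_text_features) and passed in.
        self.proj = nn.Linear(id_dim + 2 * clip_dim, d_model)
        # Level 1: intra-sequence attention within each domain (single-domain preference).
        self.intra_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        # Level 2: attention over the merged sequence (cross-domain preference).
        self.cross_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.head = nn.Linear(d_model, num_items)

    def encode(self, item_ids, img_feat, txt_feat):
        # Concatenate ID, image, and text views of each item, then project to d_model.
        x = torch.cat([self.id_emb(item_ids), img_feat, txt_feat], dim=-1)
        return self.proj(x)

    def forward(self, seq_a, seq_b, img_a, txt_a, img_b, txt_b):
        # seq_a / seq_b: item-ID sequences from domains A and B, shape (batch, length).
        ha = self.encode(seq_a, img_a, txt_a)
        hb = self.encode(seq_b, img_b, txt_b)
        ha, _ = self.intra_attn(ha, ha, ha)
        hb, _ = self.intra_attn(hb, hb, hb)
        # Merge both domains and attend across them.
        merged = torch.cat([ha, hb], dim=1)
        fused, _ = self.cross_attn(merged, merged, merged)
        # Next-item scores from the last fused position.
        return self.head(fused[:, -1])

# Usage with random stand-in features (real features would come from a frozen CLIP encoder).
model = HierarchicalFusionSketch(num_items=1000)
ids = torch.randint(0, 1000, (2, 10))
clip_feat = torch.randn(2, 10, 512)
scores = model(ids, ids, clip_feat, clip_feat, clip_feat, clip_feat)  # shape (2, 1000)
```

In this sketch the same intra-domain attention module is shared by both domains and the fused representation scores the next item; whether HAF-VT shares weights across domains or uses a separate predictor per domain is not specified in the abstract.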
Similar Papers
Align-for-Fusion: Harmonizing Triple Preferences via Dual-oriented Diffusion for Cross-domain Sequential Recommendation
Information Retrieval
Helps online stores guess what you'll buy next.
Revisiting Self-attention for Cross-domain Sequential Recommendation
Information Retrieval
Helps online stores suggest better items.
Massive Memorization with Hundreds of Trillions of Parameters for Sequential Transducer Generative Recommenders
Information Retrieval
Makes online suggestions faster with long histories.