BlossomRec: Block-level Fused Sparse Attention Mechanism for Sequential Recommendations
By: Mengyang Ma, Xiaopeng Li, Wanyu Wang, and more
Transformer architectures have been widely used in sequential recommender systems (SRS). However, as user interaction histories grow, computational time and memory requirements grow with them, mainly because the standard attention mechanism scales quadratically with sequence length. Although many methods employ efficient attention or SSM-based models, these approaches struggle to model long sequences effectively and may exhibit unstable performance on short sequences. To address these challenges, we design a sparse attention mechanism, BlossomRec, which models both long-term and short-term user interests through attention computation to achieve stable performance across sequences of varying lengths. Specifically, we categorize user interests in recommender systems into long-term and short-term interests, compute them with two distinct sparse attention patterns, and combine the results through a learnable gated output. Theoretically, BlossomRec significantly reduces the number of interactions participating in attention computation. Extensive experiments on four public datasets demonstrate that BlossomRec, when integrated with state-of-the-art Transformer-based models, achieves comparable or even superior performance while significantly reducing memory usage, providing strong evidence of its efficiency and effectiveness. The code is available at https://github.com/ronineume/BlossomRec.
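To make the idea concrete, below is a minimal sketch (not the authors' implementation) of the mechanism described in the abstract: a local sliding-window attention pattern for short-term interests, a block-level sparse pattern for long-term interests, and a learnable gate that fuses the two outputs. The window size, block size, gate parameterization, and all names (`GatedSparseAttention`, `local_mask`, `block_mask`) are illustrative assumptions.

```python
# Hedged sketch of two sparse attention patterns fused by a learnable gate.
# All hyperparameters and the gating form are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


def sparse_attention(q, k, v, mask):
    # q, k, v: (batch, heads, seq_len, head_dim); mask: (seq_len, seq_len) bool,
    # True where attention is allowed.
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v


def local_mask(seq_len, window):
    # Causal sliding window: each position attends to its `window` most recent items
    # (short-term interest).
    idx = torch.arange(seq_len)
    diff = idx.unsqueeze(1) - idx.unsqueeze(0)
    return (diff >= 0) & (diff < window)


def block_mask(seq_len, block):
    # Block-level causal pattern: each position attends to the last item of every
    # preceding block, a coarse summary of distant history (long-term interest).
    # Self-attention is always allowed so every row has at least one valid key.
    idx = torch.arange(seq_len)
    is_block_end = (idx + 1) % block == 0
    causal = idx.unsqueeze(1) >= idx.unsqueeze(0)
    return (causal & is_block_end.unsqueeze(0)) | torch.eye(seq_len, dtype=torch.bool)


class GatedSparseAttention(nn.Module):
    """Fuse short-term (local) and long-term (block-level) sparse attention with a gate."""

    def __init__(self, dim, heads=2, window=8, block=16):
        super().__init__()
        self.heads, self.window, self.block = heads, window, block
        self.qkv = nn.Linear(dim, 3 * dim)
        self.gate = nn.Linear(dim, 1)  # learnable per-token gate (assumed form)
        self.out = nn.Linear(dim, dim)

    def forward(self, x):
        b, n, d = x.shape
        h = self.heads
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (t.view(b, n, h, d // h).transpose(1, 2) for t in (q, k, v))
        short = sparse_attention(q, k, v, local_mask(n, self.window))
        long = sparse_attention(q, k, v, block_mask(n, self.block))
        g = torch.sigmoid(self.gate(x)).unsqueeze(1)   # (b, 1, n, 1), broadcast over heads
        fused = g * short + (1 - g) * long
        return self.out(fused.transpose(1, 2).reshape(b, n, d))


# Usage on a toy batch of item embeddings.
x = torch.randn(4, 64, 32)        # (batch, sequence length, embedding dim)
attn = GatedSparseAttention(dim=32)
print(attn(x).shape)              # torch.Size([4, 64, 32])
```

Because each query only scores keys allowed by its mask, the number of interactions per position is bounded by the window size plus the number of preceding blocks, rather than the full sequence length, which is the efficiency argument the abstract makes.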
Similar Papers
The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs
Computation and Language
Makes AI understand much longer stories.
Why Generate When You Can Transform? Unleashing Generative Attention for Dynamic Recommendation
Information Retrieval
Helps apps guess what you'll like next.
Understanding and Enhancing Mamba-Transformer Hybrids for Memory Recall and Language Modeling
Computation and Language
Makes AI understand long stories better.