Gated Rotary-Enhanced Linear Attention for Long-term Sequential Recommendation
By: Juntao Hu, Wei Zhou, Huayi Shen and more
Potential Business Impact:
Helps online stores guess what you'll buy next.
In Sequential Recommendation Systems (SRSs), Transformer models show remarkable performance but become computationally expensive when modeling long-term user behavior sequences, because the dot-product attention mechanism has quadratic complexity in sequence length. By approximating dot-product attention, linear attention provides an efficient alternative with linear complexity. However, existing linear attention methods face two limitations: 1) they often use learnable position encodings, which incur extra computational costs in long-term sequence scenarios, and 2) they may not consider the user's fine-grained local preferences and may confuse these with actual changes in long-term interests. To remedy these drawbacks, we propose a long-term sequential Recommendation model with Gated Rotary-Enhanced Linear Attention (RecGRELA). Specifically, we first propose a Rotary-Enhanced Linear Attention (RELA) module to model long-range dependencies within the user's historical information using rotary position encodings. We then introduce a local short operation to incorporate local preferences and demonstrate the theoretical insight behind it. We further introduce a SiLU-based Gated mechanism for RELA (GRELA) to help the model determine whether a user's behavior indicates local interest or a genuine shift in long-term preferences. Experimental results on four public datasets demonstrate that RecGRELA achieves state-of-the-art performance compared to existing SRSs while maintaining low memory overhead.
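The abstract combines three ingredients: linear attention, rotary position encodings, and a SiLU-based gate. The sketch below shows one generic way these pieces can fit together in NumPy; it is not the authors' RecGRELA implementation. Everything specific here is an assumption made for illustration: the `elu(x) + 1` feature map, the rotary base of 10000, the choice to normalize with unrotated features, the gate input `g`, and all function names. The local short operation is not modeled, since the abstract does not define it.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: a positive kernel feature map (assumed choice, not from the paper)
    return np.where(x > 0, x + 1.0, np.exp(x))

def rotary(x):
    # Rotate channel pairs of x (shape (n, d), d even) by position-dependent angles.
    n, d = x.shape
    half = d // 2
    freqs = 1.0 / (10000.0 ** (np.arange(half) / half))  # assumed base of 10000
    angles = np.arange(n)[:, None] * freqs[None, :]       # (n, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=1)

def silu(x):
    return x / (1.0 + np.exp(-x))

def gated_rotary_linear_attention(q, k, v, g):
    # q, k: (n, d); v, g: (n, d_v). g is a hypothetical gate pre-activation.
    phi_q, phi_k = feature_map(q), feature_map(k)
    rq, rk = rotary(phi_q), rotary(phi_k)
    # Causal running sums replace the n x n attention matrix:
    #   kv[t] = sum_{s<=t} outer(rk[s], v[s]),  z[t] = sum_{s<=t} phi_k[s]
    kv = np.cumsum(rk[:, :, None] * v[:, None, :], axis=0)  # (n, d, d_v)
    z = np.cumsum(phi_k, axis=0)                            # (n, d)
    num = np.einsum('nd,nde->ne', rq, kv)
    # Normalize with the unrotated (strictly positive) features so den > 0.
    den = np.einsum('nd,nd->n', phi_q, z)[:, None]
    return silu(g) * (num / den)

rng = np.random.default_rng(0)
n, d = 6, 4
q, k, v, g = (rng.standard_normal((n, d)) for _ in range(4))
out = gated_rotary_linear_attention(q, k, v, g)  # shape (n, d)
```

For readability the sketch materializes every prefix sum at once, which costs O(n·d·d_v) memory. A streaming version would keep only the latest `kv` and `z` and update them per step, so the state size stays independent of sequence length.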
Similar Papers
Breaking Complexity Barriers: High-Resolution Image Restoration with Rank Enhanced Linear Attention
CV and Pattern Recognition
Fixes blurry pictures faster, even big ones.
An Efficient Attention Mechanism for Sequential Recommendation Tasks: HydraRec
Information Retrieval
Recommends items faster for shoppers.
LREA: Low-Rank Efficient Attention on Modeling Long-Term User Behaviors for CTR Prediction
Information Retrieval
Makes ads show up faster and better.