FuXi-β: Towards a Lightweight and Fast Large-Scale Generative Recommendation Model
By: Yufei Ye, Wei Guo, Hao Wang, and more
Potential Business Impact:
Makes recommendation models faster to train and serve while improving recommendation quality.
Scaling laws for autoregressive generative recommenders reveal the potential for larger, more versatile systems, but larger models also incur greater latency and training costs. To accelerate training and inference, we investigated the recent generative recommendation models HSTU and FuXi-$\alpha$ and identified two efficiency bottlenecks: the indexing operations in the relative temporal attention bias and the computation of the query-key attention map. We also observed that the relative attention bias in self-attention mechanisms can itself serve as an attention map. Prior work such as Synthesizer has shown that alternative forms of attention maps can achieve comparable performance, naturally raising the question of whether some attention maps are redundant. Through empirical experiments, we found that using the query-key attention map can degrade the model's performance on recommendation tasks. To address these bottlenecks, we propose a new framework applicable to Transformer-like recommendation models. First, we introduce Functional Relative Attention Bias, which avoids the time-consuming indexing operations of the original relative attention bias, thereby accelerating computation. Second, we remove the query-key attention map from the original self-attention layer and design a new Attention-Free Token Mixer module. Applying this framework to FuXi-$\alpha$ yields a new model, FuXi-$\beta$. Experiments across multiple datasets demonstrate that FuXi-$\beta$ outperforms previous state-of-the-art models and achieves significant acceleration over FuXi-$\alpha$, while still adhering to the scaling law. Notably, FuXi-$\beta$ improves the NDCG@10 metric by 27% to 47% on large-scale industrial datasets compared to FuXi-$\alpha$. Our code is available in a public repository: https://github.com/USTC-StarTeam/FuXi-beta
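To make the two ideas in the abstract concrete, below is a minimal PyTorch sketch, not the authors' implementation (see the linked repository for that): it assumes a "functional" relative attention bias computed directly from pairwise timestamp gaps with a small MLP instead of indexing a learned bias table, and an attention-free token mixer that uses only that bias (masked and normalized) to mix tokens, with no query-key dot product. All class and argument names are hypothetical.

```python
# Hedged sketch of the two mechanisms named in the abstract; names are illustrative.
import torch
import torch.nn as nn


class FunctionalRelativeBias(nn.Module):
    """Maps pairwise time gaps to bias scores with a small MLP,
    avoiding gather/indexing into a bucketed bias table."""

    def __init__(self, hidden: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(1, hidden), nn.SiLU(), nn.Linear(hidden, 1))

    def forward(self, timestamps: torch.Tensor) -> torch.Tensor:
        # timestamps: (batch, seq_len) -> bias: (batch, seq_len, seq_len)
        delta = timestamps.unsqueeze(-1) - timestamps.unsqueeze(-2)    # pairwise gaps
        delta = torch.log1p(delta.clamp(min=0).float()).unsqueeze(-1)  # compress scale
        return self.mlp(delta).squeeze(-1)


class AttentionFreeTokenMixer(nn.Module):
    """Mixes tokens with the causal-masked, normalized relative bias alone;
    no query-key attention map is computed."""

    def __init__(self, dim: int):
        super().__init__()
        self.value = nn.Linear(dim, dim)
        self.out = nn.Linear(dim, dim)
        self.bias_fn = FunctionalRelativeBias()

    def forward(self, x: torch.Tensor, timestamps: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        seq_len = x.size(1)
        mix = self.bias_fn(timestamps)                                  # (B, L, L)
        causal = torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device).tril()
        mix = mix.masked_fill(~causal, float("-inf")).softmax(dim=-1)
        return self.out(mix @ self.value(x))


if __name__ == "__main__":
    x = torch.randn(2, 8, 32)
    ts = torch.cumsum(torch.randint(1, 100, (2, 8)), dim=-1)
    print(AttentionFreeTokenMixer(32)(x, ts).shape)  # torch.Size([2, 8, 32])
```

The sketch only illustrates why the reformulation is cheaper: the bias is a closed-form function of time gaps (no per-pair table lookups), and dropping the query-key map removes one of the two quadratic score computations per layer.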
Similar Papers
FuXi-$γ$: Efficient Sequential Recommendation with Exponential-Power Temporal Encoder and Diagonal-Sparse Positional Mechanism
Information Retrieval
Recommends better things, faster.
FAIR: Focused Attention Is All You Need for Generative Recommendation
Information Retrieval
Helps online stores show you better stuff.
From Scaling to Structured Expressivity: Rethinking Transformers for CTR Prediction
Information Retrieval
Helps online ads perform better by understanding user choices.