Semantics Meet Signals: Dual Codebook Representation Learning for Generative Recommendation
By: Zheng Hui, Xiaokai Wei, Reza Shirkavand, and more
Potential Business Impact:
Helps online stores recommend both popular and niche items more accurately.
Generative recommendation has recently emerged as a powerful paradigm that unifies retrieval and generation, representing items as discrete semantic tokens and enabling flexible sequence modeling with autoregressive models. Despite its success, existing approaches rely on a single, uniform codebook to encode all items, overlooking the inherent imbalance between popular items rich in collaborative signals and long-tail items that depend on semantic understanding. We argue that this uniform treatment limits representational efficiency and hinders generalization. To address this, we introduce FlexCode, a popularity-aware framework that adaptively allocates a fixed token budget between a collaborative filtering (CF) codebook and a semantic codebook. A lightweight mixture-of-experts (MoE) dynamically balances CF-specific precision and semantic generalization, while an alignment and smoothness objective maintains coherence across the popularity spectrum. We perform experiments on both public and industrial-scale datasets, showing that FlexCode consistently outperforms strong baselines. FlexCode provides a new mechanism for token representation in generative recommenders, achieving stronger accuracy and tail robustness, and offering a new perspective on balancing memorization and generalization in token-based recommendation models.
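The core idea of the abstract can be illustrated with a small sketch: split a fixed token budget between a CF codebook and a semantic codebook based on item popularity, and blend the two representations with a popularity-dependent gate. The allocation rule, gating logit, and function names below are hypothetical illustrations, not the paper's actual method.

```python
import numpy as np

def allocate_token_budget(popularity, total_tokens=8, min_tokens=1):
    """Split a fixed token budget between CF and semantic codebooks.

    Popular items (popularity near 1) receive more CF tokens, since they
    carry rich collaborative signals; long-tail items (popularity near 0)
    receive more semantic tokens. Hypothetical linear rule for illustration.
    """
    cf_tokens = int(round(min_tokens + (total_tokens - 2 * min_tokens) * popularity))
    sem_tokens = total_tokens - cf_tokens
    return cf_tokens, sem_tokens

def moe_gate(cf_vec, sem_vec, popularity):
    """Toy two-expert gate: mix CF and semantic vectors with a weight
    derived from popularity (sigmoid of a hypothetical gating logit)."""
    logit = 4.0 * (popularity - 0.5)       # assumed gating logit
    w = 1.0 / (1.0 + np.exp(-logit))       # weight on the CF expert
    return w * cf_vec + (1.0 - w) * sem_vec

# A head item leans on CF tokens; a tail item leans on semantic tokens.
print(allocate_token_budget(0.9))  # → (6, 2)
print(allocate_token_budget(0.1))  # → (2, 6)
```

In the paper the gate is learned rather than a fixed function of popularity, and the alignment/smoothness objective keeps the two codebooks' representations coherent as popularity varies; this sketch only shows the budget-splitting intuition.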
Similar Papers
DiscRec: Disentangled Semantic-Collaborative Modeling for Generative Recommendation
Information Retrieval
Recommends better by separating item types.
Beyond Semantic Understanding: Preserving Collaborative Frequency Components in LLM-based Recommendation
Computation and Language
Makes online suggestions better by mixing ideas and past choices.
A Theoretically-Grounded Codebook for Digital Semantic Communications
Information Theory
Makes computers understand pictures better.