STARS: Semantic Tokens with Augmented Representations for Recommendation at Scale
By: Han Chen, Steven Zhu, Yingrui Li
Potential Business Impact:
Shows online shoppers more relevant products to buy.
Real-world ecommerce recommender systems must deliver relevant items under strict tens-of-milliseconds latency constraints despite challenges such as cold-start products, rapidly shifting user intent, and dynamic context including seasonality, holidays, and promotions. We introduce STARS, a transformer-based sequential recommendation framework built for large-scale, low-latency ecommerce settings. STARS combines several innovations: dual-memory user embeddings that separate long-term preferences from short-term session intent; semantic item tokens that fuse pretrained text embeddings, learnable deltas, and LLM-derived attribute tags, strengthening content-based matching, long-tail coverage, and cold-start performance; context-aware scoring with learned calendar and event offsets; and a latency-conscious two-stage retrieval pipeline that performs offline embedding generation and online maximum inner-product search with filtering, enabling tens-of-milliseconds response times. In offline evaluations on production-scale data, STARS improves Hit@5 by more than 75 percent relative to our existing LambdaMART system. A large-scale A/B test on 6 million visits shows statistically significant lifts, including Total Orders +0.8%, Add-to-Cart on Home +2.0%, and Visits per User +0.5%. These results demonstrate that combining semantic enrichment, multi-intent modeling, and deployment-oriented design can yield state-of-the-art recommendation quality in real-world environments without sacrificing serving efficiency.
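To make the pipeline described in the abstract concrete, here is a minimal sketch in plain NumPy. It is not the paper's implementation: the names (fuse_item_token, user_embedding), the 64-dimensional embeddings, the mixing weight alpha, the random placeholder data, and the fusion and filtering rules are all illustrative assumptions. It only shows the shape of the ideas: semantic item tokens fusing a pretrained text embedding, a learnable delta, and attribute-tag embeddings; a dual-memory user vector blending long-term and short-term intent; and online maximum inner-product search with context offsets and eligibility filtering.

import numpy as np

# Minimal illustrative sketch -- not the paper's implementation. All names,
# dimensions, weights, and fusion/scoring formulas are assumptions made to
# illustrate the ideas described in the abstract.

D = 64                                   # assumed embedding dimension
rng = np.random.default_rng(0)

def fuse_item_token(text_emb, learned_delta, tag_embs):
    # Semantic item token: pretrained text embedding + learnable delta
    # + mean of LLM-derived attribute-tag embeddings, then normalized.
    tag_part = tag_embs.mean(axis=0) if len(tag_embs) else np.zeros_like(text_emb)
    v = text_emb + learned_delta + tag_part
    return v / (np.linalg.norm(v) + 1e-8)

def user_embedding(long_term, short_term, alpha=0.5):
    # Dual-memory user vector: blend long-term preferences with
    # short-term session intent (alpha is an assumed mixing weight).
    return alpha * long_term + (1.0 - alpha) * short_term

# Offline: precompute semantic item tokens for the catalog.
catalog = np.stack([
    fuse_item_token(rng.normal(size=D),              # pretrained text embedding
                    rng.normal(scale=0.1, size=D),   # learnable delta
                    rng.normal(size=(3, D)))         # attribute-tag embeddings
    for _ in range(1000)
])

# Online: maximum inner-product search with context offsets and filtering.
u = user_embedding(rng.normal(size=D), rng.normal(size=D))
context_offset = rng.normal(scale=0.05, size=len(catalog))   # e.g. holiday/promo boosts
eligible = rng.random(len(catalog)) > 0.1                     # in-stock / policy filter
scores = catalog @ u + context_offset
scores[~eligible] = -np.inf
top5 = np.argsort(-scores)[:5]
print("top-5 item ids:", top5.tolist())

In a production setting, the catalog embeddings would be generated offline and served through an approximate maximum inner-product index rather than the brute-force dot product shown here.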
Similar Papers
STORE: Semantic Tokenization, Orthogonal Rotation and Efficient Attention for Scaling Up Ranking Models
Information Retrieval
Makes online recommendations faster and smarter.
Massive Memorization with Hundreds of Trillions of Parameters for Sequential Transducer Generative Recommenders
Information Retrieval
Makes online suggestions faster with long histories.
STARS: Segment-level Token Alignment with Rejection Sampling in Large Language Models
Computation and Language
Makes AI understand what people want better.