Large Language Model Empowered Recommendation Meets All-domain Continual Pre-Training
By: Haokai Ma, Yunshan Ma, Ruobing Xie, and more
Potential Business Impact:
Helps computers suggest things you'll like.
Recent research has investigated how to integrate Large Language Models (LLMs) into recommendation, capitalizing on their semantic comprehension and open-world knowledge for user behavior understanding. These approaches predominantly employ supervised fine-tuning on single-domain user interactions to adapt LLMs to specific recommendation tasks. However, they typically face two challenges: the mismatch between general language representations and domain-specific preference patterns, and limited adaptability to multi-domain recommendation scenarios. To bridge these gaps, we introduce CPRec -- an All-domain Continual Pre-Training framework for Recommendation -- designed to holistically align LLMs with universal user behaviors through the continual pre-training paradigm. Specifically, we first design a unified prompt template and organize users' multi-domain behaviors into domain-specific behavioral sequences and all-domain mixed behavioral sequences that emulate real-world user decision logic. To optimize behavioral knowledge infusion, we devise a Warmup-Stable-Annealing learning rate schedule tailored to continual pre-training in recommendation, progressively adapting the LLM from open-world knowledge to universal recommendation tasks. To evaluate CPRec, we implement it on a large-scale dataset covering seven domains and conduct extensive experiments on five real-world datasets from two distinct platforms. Experimental results confirm that our continual pre-training paradigm significantly mitigates the semantic-behavioral discrepancy and achieves state-of-the-art performance in all recommendation scenarios. The source code will be released upon acceptance.
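To make the sequence-construction step above more concrete, here is a minimal Python sketch of how a unified prompt template could serialize a user's interactions into a domain-specific sequence and an all-domain mixed sequence ordered by timestamp. The template wording, the `interactions` structure, and the field layout are hypothetical illustrations; the paper's exact prompt is not published here.

```python
from itertools import chain

# Hypothetical record format: (timestamp, domain, item_title)
interactions = {
    "Books":  [(3, "Books",  "Dune"), (9, "Books",  "Foundation")],
    "Movies": [(1, "Movies", "Inception"), (7, "Movies", "Arrival")],
}

# Unified prompt template (illustrative wording, not the paper's exact prompt).
TEMPLATE = (
    "The user has interacted with the following items in chronological order: "
    "{history}. Predict the next item the user is likely to interact with."
)

def domain_specific_sequence(domain):
    """Serialize one domain's behaviors into a single training prompt."""
    history = ", ".join(title for _, _, title in sorted(interactions[domain]))
    return TEMPLATE.format(history=history)

def all_domain_mixed_sequence():
    """Interleave all domains by timestamp to mimic real-world decision order."""
    merged = sorted(chain.from_iterable(interactions.values()))
    history = ", ".join(f"{title} ({domain})" for _, domain, title in merged)
    return TEMPLATE.format(history=history)

print(domain_specific_sequence("Movies"))
print(all_domain_mixed_sequence())
```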
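The abstract names a Warmup-Stable-Annealing learning rate schedule but does not give its closed form. The sketch below shows the general shape such a schedule implies (linear warmup, a constant plateau, then decay to a small floor); the phase fractions, peak/floor values, and the cosine annealing curve are assumptions for illustration only.

```python
import math

def wsa_lr(step, total_steps, peak_lr=1e-5, min_lr=1e-6,
           warmup_frac=0.1, stable_frac=0.6):
    """Warmup-Stable-Annealing learning-rate schedule (illustrative shape).

    Phase boundaries and the cosine form of the annealing phase are
    assumptions; the paper only names the schedule, not its exact formula.
    """
    warmup_steps = int(total_steps * warmup_frac)
    stable_steps = int(total_steps * stable_frac)
    if step < warmup_steps:                      # linear warmup from 0 to peak
        return peak_lr * step / max(1, warmup_steps)
    if step < warmup_steps + stable_steps:       # stable plateau at peak LR
        return peak_lr
    # annealing: cosine decay from the peak down to the floor
    anneal_steps = max(1, total_steps - warmup_steps - stable_steps)
    progress = (step - warmup_steps - stable_steps) / anneal_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

# Example: inspect the schedule at a few points of a 10k-step run
for s in (0, 500, 5000, 9000, 9999):
    print(s, f"{wsa_lr(s, 10_000):.2e}")
```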
Similar Papers
Robust Uncertainty Quantification for Self-Evolving Large Language Models via Continual Domain Pretraining
Machine Learning (CS)
Helps AI learn new things without forgetting old ones.
DACP: Domain-Adaptive Continual Pre-Training of Large Language Models for Phone Conversation Summarization
Computation and Language
Teaches computers to summarize messy talks better.