DACP: Domain-Adaptive Continual Pre-Training of Large Language Models for Phone Conversation Summarization
By: Xue-Yong Fu, Elena Khasanova, Md Tahmid Rahman Laskar, and more
Potential Business Impact:
Teaches computers to summarize messy phone conversations better.
Large language models (LLMs) have achieved impressive performance in text summarization, yet they often fall short when applied to specialized domains or conversational data that differ from their original pre-training distribution. While fine-tuning can improve summarization quality, it typically relies on costly and scarce high-quality labeled data. In this work, we explore continual pre-training as a scalable, self-supervised approach to adapt LLMs for downstream summarization tasks, particularly in the context of noisy real-world conversation transcripts. We conduct extensive experiments using large-scale, unlabeled business conversation data to investigate whether continual pre-training enhances model capabilities in conversational summarization. Our results demonstrate that continual pre-training yields substantial gains in both in-domain and out-of-domain summarization benchmarks, while maintaining strong generalization and robustness. We also analyze the effects of data selection strategies, providing practical guidelines for applying continual pre-training in summarization-focused industrial applications.
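To make the recipe concrete, the sketch below shows what self-supervised continual pre-training on unlabeled conversation transcripts might look like with Hugging Face transformers: a pretrained causal LLM is further trained with the standard next-token objective on raw transcript text, with no labels required. The base model name, data file, and hyperparameters here are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch of domain-adaptive continual pre-training (DACP-style),
# assuming unlabeled transcripts stored one per line in transcripts.txt.
# Model name, data path, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "meta-llama/Llama-2-7b-hf"  # placeholder base LLM

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Load raw, unlabeled conversation transcripts (assumed plain-text format).
raw = load_dataset("text", data_files={"train": "transcripts.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="dacp-checkpoint",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        learning_rate=1e-5,
        bf16=True,
    ),
    train_dataset=tokenized,
    # mlm=False gives the standard causal (next-token) LM objective,
    # i.e. the self-supervised signal used for continual pre-training.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("dacp-checkpoint")
```

After this adaptation step, the checkpoint would typically be evaluated or fine-tuned on downstream summarization benchmarks; the data selection strategies the paper analyzes would determine which transcripts enter the training file above.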
Similar Papers
Domain-Adaptive Continued Pre-Training of Small Language Models
Computation and Language
Makes small AI smarter with less computer power.
MachineLearningLM: Continued Pretraining Language Models on Millions of Synthetic Tabular Prediction Tasks Scales In-Context ML
Computation and Language
Teaches computers to learn from many examples.