Replay to Remember: Retaining Domain Knowledge in Streaming Language Models
By: Sneh Pillai
Potential Business Impact:
Keeps AI smart when learning new things.
Continual learning in large language models (LLMs) faces the critical challenge of catastrophic forgetting, where previously acquired knowledge deteriorates upon exposure to new data. While techniques such as replay buffers and parameter-efficient tuning (e.g., Low-Rank Adaptation, or LoRA) have been proposed, few studies investigate real-time domain adaptation under strict computational and data-stream constraints. In this paper, we demonstrate a lightweight method that combines LoRA with a minimal replay mechanism in a realistic streaming setting across three diverse knowledge domains: medical question answering, genetics, and law. Using perplexity, semantic similarity, and GPT-based, human-like evaluation metrics, we quantify the model's adaptation, forgetting, and recovery over time. Our experiments reveal that although catastrophic forgetting does occur, even minimal replay significantly stabilizes and partially restores domain-specific knowledge. This study contributes practical insights for deploying adaptable LLMs in resource-constrained, real-world scenarios.
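To make the idea of a minimal replay mechanism concrete, the following is an illustrative sketch only, not the paper's implementation. It assumes a small fixed-capacity buffer filled by reservoir sampling (a choice made here for illustration; the abstract does not say how the buffer is populated) and mixes a few stored examples into each incoming batch. The names `ReplayBuffer`, `mixed_batches`, and `replay_per_batch` are hypothetical, and the LoRA update itself is omitted: each yielded batch would simply be passed to whatever fine-tuning step is in use.

```python
import random
from typing import Generic, Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


class ReplayBuffer(Generic[T]):
    """Fixed-size buffer filled by reservoir sampling (an assumption, not the paper's method)."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.items: List[T] = []
        self.seen = 0  # total number of examples offered to the buffer
        self._rng = random.Random(seed)

    def add(self, item: T) -> None:
        """Keep each example seen so far with equal probability capacity / seen."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            j = self._rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item

    def sample(self, k: int) -> List[T]:
        """Draw up to k stored examples without replacement."""
        return self._rng.sample(self.items, min(k, len(self.items)))


def mixed_batches(
    stream: Iterable[Sequence[T]],
    buffer: ReplayBuffer[T],
    replay_per_batch: int,
) -> Iterator[List[T]]:
    """Yield each incoming batch extended with a few replayed older examples."""
    for batch in stream:
        replayed = buffer.sample(replay_per_batch)
        yield list(batch) + replayed
        # Store the new examples only after the batch has been used for training.
        for example in batch:
            buffer.add(example)


if __name__ == "__main__":
    # Hypothetical stream: one batch per domain, arriving in sequence.
    stream = [["med-1", "med-2"], ["gen-1", "gen-2"], ["law-1", "law-2"]]
    buf: ReplayBuffer[str] = ReplayBuffer(capacity=4, seed=1)
    for step, batch in enumerate(mixed_batches(stream, buf, replay_per_batch=1)):
        print(step, batch)
```

Under these assumptions, the first batch contains only new data, while later batches carry one example from an earlier domain, which is the mechanism by which replay can counteract forgetting. Buffer capacity and the replay ratio are the main knobs in a resource-constrained setting.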
Similar Papers
Mitigating Catastrophic Forgetting in Streaming Generative and Predictive Learning via Stateful Replay
Machine Learning (CS)
Keeps computer learning without forgetting old lessons.
GeRe: Towards Efficient Anti-Forgetting in Continual Learning of LLM via General Samples Replay
Computation and Language
Keeps AI smart when learning new things.
Mitigating Catastrophic Forgetting and Mode Collapse in Text-to-Image Diffusion via Latent Replay
Machine Learning (CS)
Teaches AI to learn new things without forgetting.