Scalable Strategies for Continual Learning with Replay
By: Truman Hickok
Potential Business Impact:
Teaches computers to learn new things without forgetting.
Future deep learning models will be distinguished by systems that perpetually learn through interaction, imagination, and cooperation, blurring the line between training and inference. This makes continual learning a critical challenge, as methods that efficiently maximize bidirectional transfer across learning trajectories will be essential. Replay is on track to play a foundational role in continual learning, allowing models to directly reconcile new information with past knowledge. In practice, however, replay scales poorly, doubling the cost of continual learning when applied naively. Moreover, the continual learning literature has not fully synchronized with the multi-task fine-tuning literature: it has yet to integrate highly scalable techniques like model merging and low rank adaptation into a replay-enabled toolset that can produce a unified model in the face of many sequential tasks. In this paper, we begin by applying and analyzing low rank adaptation in a continual learning setting. Next, we introduce consolidation, a phasic approach to replay that reduces the number of replay samples needed for a given performance target by up to 55%. Then, we propose sequential merging, an offshoot of task arithmetic that is tailored to the continual learning setting and is shown to work well in combination with replay. Finally, we demonstrate that the developed strategies can operate synergistically, resulting in a highly scalable toolset that outperforms standalone variants.
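Two of the ideas named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. The first function assumes that "naive" replay means pairing every new sample with one replayed sample (a 1:1 ratio, our assumption), which is one way the training cost doubles. The second shows plain task arithmetic in its commonly used form (add scaled differences between fine-tuned and base weights to the base); the paper's sequential merging is described only as an offshoot of this and is not reproduced here. All function names, the `scale` parameter, and the flat-list weight representation are hypothetical.

```python
import random


def naive_replay_batches(new_data, replay_buffer, batch_size, seed=0):
    """Yield batches mixing new samples with replayed ones.

    Assumption: one replay sample is drawn per new sample (1:1 ratio),
    so each batch holds twice as many samples as the new data alone.
    """
    rng = random.Random(seed)
    for start in range(0, len(new_data), batch_size):
        new_part = new_data[start:start + batch_size]
        replay_part = [rng.choice(replay_buffer) for _ in new_part]
        yield new_part + replay_part


def task_vector(finetuned, base):
    """Element-wise difference between fine-tuned and base weights."""
    return [f - b for f, b in zip(finetuned, base)]


def task_arithmetic_merge(base, finetuned_models, scale=1.0):
    """Standard task arithmetic: base + scale * sum of task vectors.

    Weights are flat lists of floats purely for illustration.
    """
    merged = list(base)
    for finetuned in finetuned_models:
        tau = task_vector(finetuned, base)
        merged = [m + scale * t for m, t in zip(merged, tau)]
    return merged


if __name__ == "__main__":
    new_data = list(range(8))
    buffer = [100, 101]
    batches = list(naive_replay_batches(new_data, buffer, batch_size=4))
    # Samples processed with replay vs. without.
    print(sum(len(b) for b in batches), len(new_data))

    base = [1.0, 2.0]
    task_a = [2.0, 2.0]  # changed only the first weight
    task_b = [1.0, 4.0]  # changed only the second weight
    print(task_arithmetic_merge(base, [task_a, task_b]))
```

In this toy setting the two task vectors touch different weights, so merging keeps both changes intact. Real task vectors overlap and interfere, which is the kind of problem replay and merging strategies are meant to manage.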