Continuous Subspace Optimization for Continual Learning
By: Quan Cheng, Yuanyu Wan, Lingyu Wu, and more
Potential Business Impact:
Helps computers learn new things without forgetting old ones.
Continual learning aims to learn multiple tasks sequentially while preserving prior knowledge, but faces the challenge of catastrophic forgetting when acquiring new knowledge. Recently, approaches leveraging pre-trained models have gained increasing popularity for mitigating this issue, owing to the strong generalization ability of foundation models. To adapt pre-trained models to new tasks, existing methods usually employ low-rank adaptation, which restricts parameter updates to a fixed low-rank subspace. However, constraining the optimization space inherently compromises the model's learning capacity, resulting in inferior performance. To address this limitation, we propose Continuous Subspace Optimization for Continual Learning (CoSO), which fine-tunes the model in a series of subspaces rather than a single one. These sequential subspaces are dynamically determined through the singular value decomposition of gradients. CoSO updates the model by projecting gradients onto these subspaces, ensuring memory-efficient optimization. To mitigate forgetting, the optimization subspaces of each task are set to be orthogonal to the historical task subspace. During task learning, CoSO maintains a task-specific component that captures the critical update directions associated with the current task. Upon completing a task, this component is used to update the historical task subspace, laying the groundwork for subsequent learning. Extensive experiments on multiple datasets demonstrate that CoSO significantly outperforms state-of-the-art methods, especially in challenging scenarios with long task sequences.
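The abstract's core mechanics can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the rank k, learning rate, toy random gradients, and the QR-based merging of task subspaces are illustrative assumptions. It only shows the structure the abstract describes: take the SVD of a gradient to pick an optimization subspace, project updates onto it, force new-task gradients to be orthogonal to the historical task subspace, and fold each finished task's directions into that subspace.

```python
import numpy as np

rng = np.random.default_rng(0)

def top_k_subspace(grad, k):
    # Left singular vectors spanning the top-k subspace of the gradient.
    U, _, _ = np.linalg.svd(grad, full_matrices=False)
    return U[:, :k]

def project_orthogonal(grad, hist_basis):
    # Remove gradient components lying in the historical task subspace.
    if hist_basis is None:
        return grad
    return grad - hist_basis @ (hist_basis.T @ grad)

# Toy setup: one weight matrix, two sequential "tasks".
W = rng.standard_normal((8, 6))
hist_basis = None          # historical task subspace (grows after each task)
lr, k = 0.1, 2             # illustrative hyperparameters

for task in range(2):
    task_basis = None
    for step in range(3):
        grad = rng.standard_normal(W.shape)       # stand-in for a real gradient
        grad = project_orthogonal(grad, hist_basis)
        basis = top_k_subspace(grad, k)           # current optimization subspace
        W -= lr * (basis @ (basis.T @ grad))      # low-rank projected update
        task_basis = basis                        # track task-specific directions
    # Fold the finished task's directions into the historical subspace.
    if hist_basis is None:
        hist_basis = task_basis
    else:
        hist_basis = np.linalg.qr(np.hstack([hist_basis, task_basis]))[0]
```

The orthogonal projection is what guards against forgetting here: any gradient component that would move the weights along directions important to earlier tasks is subtracted out before the update is applied.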
Similar Papers
Sculpting Subspaces: Constrained Full Fine-Tuning in LLMs for Continual Learning
Machine Learning (CS)
Keeps AI smart on new tasks, not forgetting old ones.
Gradient-free Continual Learning
Machine Learning (CS)
Teaches computers new things without forgetting old ones.
Forward-Only Continual Learning
Machine Learning (CS)
Teaches computers new things without forgetting old ones.