On Understanding of the Dynamics of Model Capacity in Continual Learning

Published: August 11, 2025 | arXiv ID: 2508.08052v1

By: Supriyo Chakraborty, Krishnan Raghavan

Potential Business Impact:

Helps computers learn new things without forgetting old ones.

The stability-plasticity dilemma, closely related to a neural network's (NN) capacity, i.e., its ability to represent tasks, is a fundamental challenge in continual learning (CL). Within this context, we introduce CL's effective model capacity (CLEMC), which characterizes the dynamic behavior of the stability-plasticity balance point. We develop a difference equation to model the evolution of the interplay between the NN, the task data, and the optimization procedure. We then leverage CLEMC to demonstrate that the effective capacity, and by extension the stability-plasticity balance point, is inherently non-stationary. We show that, regardless of the NN architecture or optimization method, an NN's ability to represent new tasks diminishes when incoming task distributions differ from previous ones. We conduct extensive experiments to support our theoretical findings, spanning a range of architectures, from small feedforward and convolutional networks to medium-sized graph neural networks and transformer-based large language models with millions of parameters.
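
To make the effect described in the abstract concrete, here is a minimal sketch of the phenomenon, not the paper's method: it does not implement CLEMC or its difference-equation dynamics, and the toy tasks, network, and variable names are illustrative assumptions. It trains a small PyTorch network sequentially on synthetic tasks whose input distributions drift, tracking per-task losses as a crude proxy for how representation of old tasks degrades as new distributions diverge.

```python
# Illustrative sketch only: a toy proxy for the stability-plasticity trade-off.
# The paper's CLEMC quantity is NOT reproduced here; we merely track per-task
# losses as a stand-in for effective capacity on old vs. new tasks.
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task(shift: float, n: int = 512):
    """Synthetic regression task; `shift` moves the input distribution."""
    x = torch.randn(n, 8) + shift
    y = torch.sin(x.sum(dim=1, keepdim=True))
    return x, y

net = nn.Sequential(nn.Linear(8, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

tasks = [make_task(shift=s) for s in (0.0, 2.0, 4.0)]  # increasingly shifted

for t, (x, y) in enumerate(tasks):
    for _ in range(300):  # train on the current task only
        opt.zero_grad()
        loss_fn(net(x), y).backward()
        opt.step()
    # Evaluate on all tasks seen so far: rising loss on earlier tasks
    # reflects lost stability as the incoming distributions drift.
    with torch.no_grad():
        losses = [round(loss_fn(net(xi), yi).item(), 4)
                  for xi, yi in tasks[: t + 1]]
    print(f"after task {t}: losses on tasks 0..{t} = {losses}")
```

Running this, the loss on task 0 typically climbs after training on the shifted tasks 1 and 2, a simple instance of the non-stationarity the paper formalizes.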

Page Count
22 pages

Category
Computer Science:
Machine Learning (CS)