Dynamic Mixture of Experts Against Severe Distribution Shifts
By: Donghu Kim
Potential Business Impact:
Lets computers learn new things without forgetting old ones.
The challenge of building neural networks that continuously learn and adapt to evolving data streams is central to continual learning (CL) and reinforcement learning (RL). This lifelong learning problem is often framed as the plasticity-stability dilemma, centering on loss of plasticity and catastrophic forgetting. Unlike artificial networks, biological brains maintain plasticity partly through capacity growth, which has inspired methods that add network capacity dynamically. Prior solutions, however, often lack parameter efficiency or depend on explicit task indices; Mixture-of-Experts (MoE) architectures offer a promising alternative by letting individual experts specialize on distinct distributions. This paper evaluates a DynamicMoE approach in continual and reinforcement learning environments and benchmarks its effectiveness against existing network expansion methods.
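To make the idea of dynamic capacity growth concrete, here is a minimal sketch of a soft-gated MoE layer whose expert pool can be grown when a severe distribution shift is detected. This is not the paper's implementation: the class and method names (DynamicMoELayer, add_expert), the expert sizes, and the growth trigger are illustrative assumptions.

```python
# Minimal sketch of a dynamically growing Mixture-of-Experts layer (PyTorch).
# Assumptions: soft gating over a pool of small MLP experts; capacity is grown
# by appending an expert and widening the gate. Not the authors' architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicMoELayer(nn.Module):
    def __init__(self, in_dim: int, hidden_dim: int, out_dim: int, num_experts: int = 2):
        super().__init__()
        self.in_dim, self.hidden_dim, self.out_dim = in_dim, hidden_dim, out_dim
        # Each expert is a small MLP; experts can specialize on distinct distributions.
        self.experts = nn.ModuleList([self._make_expert() for _ in range(num_experts)])
        # Gating network produces mixing weights over the current expert pool.
        self.gate = nn.Linear(in_dim, num_experts)

    def _make_expert(self) -> nn.Module:
        return nn.Sequential(
            nn.Linear(self.in_dim, self.hidden_dim),
            nn.ReLU(),
            nn.Linear(self.hidden_dim, self.out_dim),
        )

    def add_expert(self) -> None:
        """Grow capacity by one expert, e.g. when a distribution shift is detected."""
        self.experts.append(self._make_expert())
        # Rebuild the gate with one more output; copy old weights so routing to
        # the existing experts is preserved (stability for old knowledge).
        old_gate = self.gate
        new_gate = nn.Linear(self.in_dim, len(self.experts))
        with torch.no_grad():
            new_gate.weight[: old_gate.out_features].copy_(old_gate.weight)
            new_gate.bias[: old_gate.out_features].copy_(old_gate.bias)
        self.gate = new_gate

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = F.softmax(self.gate(x), dim=-1)                    # (batch, E)
        outputs = torch.stack([e(x) for e in self.experts], dim=1)   # (batch, E, out_dim)
        return (weights.unsqueeze(-1) * outputs).sum(dim=1)          # weighted mixture


# Usage: grow the expert pool instead of retraining from scratch after a shift.
layer = DynamicMoELayer(in_dim=16, hidden_dim=32, out_dim=8)
x = torch.randn(4, 16)
y_before = layer(x)
layer.add_expert()  # capacity growth on distribution shift (trigger is assumed)
y_after = layer(x)
print(y_before.shape, y_after.shape)  # torch.Size([4, 8]) both times
```

The design choice illustrated here is that new experts add plasticity for the new distribution, while copying the old gate weights keeps routing to existing experts stable; how the actual DynamicMoE triggers growth and regularizes the gate would follow the paper.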
Similar Papers
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
Machine Learning (CS)
Makes smart computer programs use less power.
Dynamic Mixture-of-Experts for Incremental Graph Learning
Machine Learning (CS)
Teaches computers to learn new things without forgetting old ones.
Plasticity-Aware Mixture of Experts for Learning Under QoE Shifts in Adaptive Video Streaming
Multimedia
Makes videos play better for everyone.