Mitigating Intra- and Inter-modal Forgetting in Continual Learning of Unified Multimodal Models
By: Xiwen Wei, Mustafa Munir, Radu Marculescu
Potential Business Impact:
Teaches AI to learn new things without forgetting old ones.
Unified Multimodal Generative Models (UMGMs) unify visual understanding and image generation within a single autoregressive framework. However, their ability to continually learn new tasks is severely hindered by catastrophic forgetting, both within a modality (intra-modal) and across modalities (inter-modal). While intra-modal forgetting has been studied in prior continual learning (CL) work, inter-modal forgetting remains largely unexplored. In this paper, we identify and empirically validate this phenomenon in UMGMs and provide a theoretical explanation rooted in gradient conflict between modalities. To address both intra- and inter-modal forgetting, we propose Modality-Decoupled Experts (MoDE), a lightweight and scalable architecture that isolates modality-specific updates to mitigate the gradient conflict and leverages knowledge distillation to prevent catastrophic forgetting and preserve pre-trained capabilities. Unlike previous CL methods that remain modality-coupled and suffer from modality gradient conflict, MoDE explicitly decouples modalities to prevent interference. Experiments across diverse benchmarks demonstrate that MoDE significantly mitigates both inter- and intra-modal forgetting, outperforming prior CL baselines in unified multimodal generation settings. Code will be publicly available at: https://github.com/Christina200/MoDE-official.git
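To make the decoupling idea concrete, the sketch below shows one plausible way to implement it; this is an illustration only, not the authors' released implementation. It assumes per-modality low-rank experts attached to a frozen pre-trained linear layer, with a knowledge-distillation term against a frozen teacher copy. The names ModalityDecoupledLayer and distillation_loss are hypothetical.

```python
# Illustrative sketch only -- not the authors' released MoDE implementation.
# Idea: each modality (understanding vs. generation) gets its own lightweight
# expert, so updates for one modality never touch the other's parameters,
# while a frozen copy of the pre-trained layer serves as a distillation target.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModalityDecoupledLayer(nn.Module):
    """A frozen pre-trained linear layer plus one small expert per modality."""

    def __init__(self, base_layer: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base_layer
        for p in self.base.parameters():          # keep pre-trained weights fixed
            p.requires_grad_(False)
        d_in, d_out = base_layer.in_features, base_layer.out_features
        # One low-rank expert per modality; only these receive gradients.
        self.experts = nn.ModuleDict({
            m: nn.Sequential(nn.Linear(d_in, rank, bias=False),
                             nn.Linear(rank, d_out, bias=False))
            for m in ("understanding", "generation")
        })

    def forward(self, x: torch.Tensor, modality: str) -> torch.Tensor:
        # Routing by modality keeps the two experts' gradients disjoint,
        # so one modality's task updates cannot conflict with the other's.
        return self.base(x) + self.experts[modality](x)


def distillation_loss(student_out, teacher_out, T: float = 2.0):
    """KL distillation from a frozen teacher to preserve pre-trained behavior."""
    return F.kl_div(F.log_softmax(student_out / T, dim=-1),
                    F.softmax(teacher_out / T, dim=-1),
                    reduction="batchmean") * T * T


if __name__ == "__main__":
    layer = ModalityDecoupledLayer(nn.Linear(16, 16))
    teacher = copy.deepcopy(layer).eval()          # snapshot before the new task
    x = torch.randn(4, 16)
    out = layer(x, modality="generation")
    with torch.no_grad():                          # teacher provides targets only
        teacher_out = teacher(x, modality="generation")
    loss = out.pow(2).mean() + distillation_loss(out, teacher_out)
    loss.backward()
    # Only the "generation" expert accumulates gradients; the "understanding"
    # expert and the frozen base layer are left untouched.
    print({n: (p.grad is not None)
           for n, p in layer.named_parameters() if p.requires_grad})
```

Because each forward pass touches only the expert matching the current modality, gradients from understanding tasks never flow into the generation expert (and vice versa), which is the kind of isolation the abstract attributes to MoDE; the distillation term is one standard way to further guard against forgetting pre-trained capabilities.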
Similar Papers
Multimodal Continual Learning with MLLMs from Multi-scenario Perspectives
CV and Pattern Recognition
Helps AI remember new things without forgetting old ones.
Continual Learning for Multiple Modalities
CV and Pattern Recognition
Teaches computers to learn new things without forgetting.
Continual Learning for VLMs: A Survey and Taxonomy Beyond Forgetting
CV and Pattern Recognition
Helps AI learn new things without forgetting old ones.