Augmenting Continual Learning of Diseases with LLM-Generated Visual Concepts
By: Jiantao Tan, Peixian Ma, Kanghao Chen, and more
Potential Business Impact:
Helps AI learn to recognize new medical images better.
Continual learning is essential for medical image classification systems to adapt to dynamically evolving clinical environments. Integrating multimodal information can significantly enhance continual learning of image classes. However, existing approaches that do utilize textual information rely solely on simplistic templates containing a class name, neglecting richer semantic information. To address these limitations, we propose a novel framework that harnesses visual concepts generated by large language models (LLMs) as discriminative semantic guidance. Our method dynamically constructs a visual concept pool with a similarity-based filtering mechanism to prevent redundancy. To integrate the concepts into the continual learning process, we then employ a cross-modal image-concept attention module coupled with an attention loss. Through attention, the module leverages the semantic knowledge of relevant visual concepts and produces class-representative fused features for classification. Experiments on medical and natural image datasets show that our method achieves state-of-the-art performance, demonstrating its effectiveness. We will release the code publicly.
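The abstract's two core mechanisms, a similarity-filtered visual concept pool and a cross-modal image-concept attention module, can be illustrated with a short sketch. The PyTorch code below is a minimal, hypothetical rendering, not the authors' forthcoming release: the class names, feature dimension, and 0.9 cosine-similarity threshold are all illustrative assumptions.

```python
# Hypothetical sketch of the abstract's two mechanisms; names, dimensions,
# and the similarity threshold are illustrative assumptions, not the
# authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConceptPool:
    """Stores text embeddings of LLM-generated visual concepts, rejecting
    candidates too similar to ones already kept (redundancy filtering)."""

    def __init__(self, sim_threshold: float = 0.9):
        self.sim_threshold = sim_threshold
        self.embeddings: list[torch.Tensor] = []  # each entry: (dim,)

    def add(self, candidate: torch.Tensor) -> bool:
        """Keep a concept embedding only if it is not redundant."""
        candidate = F.normalize(candidate, dim=-1)
        for kept in self.embeddings:
            # Kept entries are already normalized, so the dot product
            # is cosine similarity.
            if torch.dot(candidate, kept).item() > self.sim_threshold:
                return False  # too similar: filtered out
        self.embeddings.append(candidate)
        return True

    def as_tensor(self) -> torch.Tensor:
        return torch.stack(self.embeddings)  # (num_concepts, dim)


class ImageConceptAttention(nn.Module):
    """Cross-modal attention: image features attend over the concept pool,
    and the attended semantics are fused into features for classification."""

    def __init__(self, dim: int, num_classes: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, img_feat: torch.Tensor, concepts: torch.Tensor):
        # img_feat: (batch, dim); concepts: (num_concepts, dim)
        q = img_feat.unsqueeze(1)                        # (batch, 1, dim)
        kv = concepts.unsqueeze(0).expand(img_feat.size(0), -1, -1)
        attended, attn_weights = self.attn(q, kv, kv)    # attend to concepts
        fused = img_feat + attended.squeeze(1)           # residual fusion
        return self.classifier(fused), attn_weights


# Usage with stand-in random embeddings in place of LLM concept encodings.
pool = ConceptPool()
for emb in torch.randn(20, 512):
    pool.add(emb)
module = ImageConceptAttention(dim=512, num_classes=10)
logits, weights = module(torch.randn(4, 512), pool.as_tensor())
```

In the paper's framework, the attention weights returned here would additionally be supervised by the attention loss; since the abstract does not specify its form, it is omitted from this sketch.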
Similar Papers
Forging a Dynamic Memory: Retrieval-Guided Continual Learning for Generalist Medical Foundation Models
CV and Pattern Recognition
Helps AI learn to recognize new medical images better.
Retrieval-Augmented VLMs for Multimodal Melanoma Diagnosis
CV and Pattern Recognition
Helps doctors spot skin cancer faster and more accurately.
Latent Implicit Visual Reasoning
CV and Pattern Recognition
Computers learn to understand pictures better on their own.