Collaborative Learning with Multiple Foundation Models for Source-Free Domain Adaptation
By: Huisoo Lee, Jisu Han, Hyunsouk Cho and more
Potential Business Impact:
Uses multiple AI brains to improve computer vision.
Source-Free Domain Adaptation (SFDA) aims to adapt a pre-trained source model to an unlabeled target domain without access to source data. Recent advances in Foundation Models (FMs) have introduced new opportunities for leveraging external semantic knowledge to guide SFDA. However, relying on a single FM is often insufficient: it tends to bias adaptation toward restricted semantic coverage and fails to capture diverse contextual cues under domain shift. To overcome this limitation, we propose a Collaborative Multi-foundation Adaptation (CoMA) framework that jointly leverages two different FMs (e.g., CLIP and BLIP) with complementary properties to capture both global semantics and local contextual cues. Specifically, we employ a bidirectional adaptation mechanism that (1) aligns the FMs with the target model for task adaptation while maintaining their semantic distinctiveness, and (2) transfers complementary knowledge from the FMs to the target model. To ensure stable adaptation under mini-batch training, we introduce Decomposed Mutual Information (DMI), which selectively enhances true dependencies while suppressing false dependencies arising from incomplete class coverage. Extensive experiments demonstrate that our method consistently outperforms existing state-of-the-art SFDA methods on four benchmarks (Office-31, Office-Home, DomainNet-126, and VisDA) under the closed-set setting, while also achieving the best results on partial-set and open-set variants.
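The abstract does not define DMI, so the following is only a minimal sketch of the plain (undecomposed) mutual information between two models' class predictions over one mini-batch, which is the kind of quantity a batch-level objective of this sort would start from. The function name `batch_mutual_information`, its arguments, and the empirical-joint estimator are assumptions made for illustration; this is not the paper's method.

```python
import math

def batch_mutual_information(p, q, eps=1e-12):
    """Plain mutual information (in nats) between two models' predictions.

    p, q: lists of per-sample class-probability vectors for the same
    mini-batch (hypothetical inputs, e.g. target-model and FM outputs).
    """
    n, k = len(p), len(p[0])
    # Empirical joint distribution: batch average of the outer products.
    joint = [[sum(p[s][i] * q[s][j] for s in range(n)) / n
              for j in range(k)] for i in range(k)]
    row = [sum(joint[i]) for i in range(k)]                       # marginal of p
    col = [sum(joint[i][j] for i in range(k)) for j in range(k)]  # marginal of q
    mi = 0.0
    for i in range(k):
        for j in range(k):
            if joint[i][j] > eps:  # skip empty cells, where 0 * log 0 := 0
                mi += joint[i][j] * math.log(joint[i][j] / (row[i] * col[j]))
    return mi

# A 3-class task where the mini-batch happens to contain only classes 0 and 1.
p = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
q = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(batch_mutual_information(p, q))  # bounded by log 2, not log 3
```

In this toy batch the two models agree perfectly, yet the estimate cannot exceed log 2 because class 2 never appears. This illustrates the incomplete class coverage that the abstract names as a source of instability under mini-batch training.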
Similar Papers
SCoDA: Self-supervised Continual Domain Adaptation
CV and Pattern Recognition
Teaches computers new things without old examples.
DDFP: Data-dependent Frequency Prompt for Source Free Domain Adaptation of Medical Image Segmentation
CV and Pattern Recognition
Helps AI learn from new medical pictures.
Beyond Boundaries: Leveraging Vision Foundation Models for Source-Free Object Detection
CV and Pattern Recognition
Teaches computers to find objects in new places.