Revisiting semi-supervised learning in the era of foundation models
By: Ping Zhang, Zheda Mai, Quang-Huy Nguyen, and more
Potential Business Impact:
Makes AI learn better with fewer labeled pictures.
Semi-supervised learning (SSL) leverages abundant unlabeled data alongside limited labeled data to enhance learning. As vision foundation models (VFMs) increasingly serve as the backbone of vision applications, it remains unclear how SSL interacts with these pre-trained models. To address this gap, we develop new SSL benchmark datasets where frozen VFMs underperform and systematically evaluate representative SSL methods. We make a surprising observation: parameter-efficient fine-tuning (PEFT) using only labeled data often matches SSL performance, even without leveraging unlabeled data. This motivates us to revisit self-training, a conceptually simple SSL baseline, where we use the supervised PEFT model to pseudo-label unlabeled data for further training. To overcome the notorious issue of noisy pseudo-labels, we propose ensembling multiple PEFT approaches and VFM backbones to produce more robust pseudo-labels. Empirical results validate the effectiveness of this simple yet powerful approach, providing actionable insights into SSL with VFMs and paving the way for more scalable and practical semi-supervised learning in the era of foundation models.
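The self-training recipe described above can be sketched in a few lines: each fine-tuned model scores the unlabeled data, the class probabilities are averaged across the ensemble, and only high-confidence predictions are kept as pseudo-labels. This is an illustrative sketch, not the paper's implementation; the function name, the averaging rule, and the 0.8 confidence threshold are assumptions for demonstration.

```python
import numpy as np

def ensemble_pseudo_labels(prob_list, threshold=0.8):
    """Average class probabilities from several fine-tuned models and
    keep only high-confidence pseudo-labels (hypothetical helper).

    prob_list: list of (N, C) arrays, one per PEFT model / VFM backbone.
    Returns (indices, labels) for the unlabeled samples that pass the
    confidence threshold.
    """
    avg = np.mean(prob_list, axis=0)     # (N, C) ensemble probabilities
    conf = avg.max(axis=1)               # confidence of the top class
    labels = avg.argmax(axis=1)          # pseudo-label = argmax class
    keep = conf >= threshold             # filter out likely-noisy labels
    return np.nonzero(keep)[0], labels[keep]

# Toy example: two "models" scoring 4 unlabeled samples over 3 classes.
p1 = np.array([[0.90, 0.05, 0.05],
               [0.40, 0.30, 0.30],
               [0.10, 0.85, 0.05],
               [0.50, 0.45, 0.05]])
p2 = np.array([[0.85, 0.10, 0.05],
               [0.34, 0.33, 0.33],
               [0.20, 0.75, 0.05],
               [0.40, 0.50, 0.10]])
idx, y = ensemble_pseudo_labels([p1, p2], threshold=0.8)
# Only samples where the averaged top-class probability reaches 0.8
# survive; the rest stay unlabeled for this round of self-training.
```

The selected pseudo-labeled samples would then be merged with the labeled set for a further round of fine-tuning.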
Similar Papers
Solving Semi-Supervised Few-Shot Learning from an Auto-Annotation Perspective
CV and Pattern Recognition
Teaches computers to label pictures with little help.
Bridging Brain with Foundation Models through Self-Supervised Learning
Machine Learning (CS)
Lets computers understand brain signals without labels.
ULFine: Unbiased Lightweight Fine-tuning for Foundation-Model-Assisted Long-Tailed Semi-Supervised Learning
CV and Pattern Recognition
Helps computers learn rare things better and faster.