Bi-CoG: Bi-Consistency-Guided Self-Training for Vision-Language Models
By: Rui Zhu, Song-Lin Lv, Zi-Kang Wang, and more
Potential Business Impact:
Makes AI learn better with fewer labeled examples.
Exploiting unlabeled data through semi-supervised learning (SSL) and leveraging pre-trained models via fine-tuning are two prevailing paradigms for addressing label-scarce scenarios. Recently, growing attention has been given to combining fine-tuning of pre-trained vision-language models (VLMs) with SSL, forming the emerging paradigm of semi-supervised fine-tuning. However, existing methods often suffer from model bias and hyperparameter sensitivity, due to their reliance on prediction consistency or pre-defined confidence thresholds. To address these limitations, we propose a simple yet effective plug-and-play methodology named $\underline{\textbf{Bi-Co}}$nsistency-$\underline{\textbf{G}}$uided Self-Training (Bi-CoG), which assigns high-quality, low-bias pseudo-labels by simultaneously exploiting inter-model and intra-model consistency, along with an error-aware dynamic pseudo-label assignment strategy. Both theoretical analysis and extensive experiments on 14 datasets demonstrate the effectiveness of Bi-CoG, which consistently and significantly improves the performance of existing methods.
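The abstract does not give implementation details, but the core selection idea can be illustrated with a short sketch. The function below is an assumption-laden illustration, not the authors' code: it keeps a pseudo-label only when a sample's prediction agrees across two models (inter-model consistency) and across weak/strong augmented views of the same model (intra-model consistency). The function name, inputs, and the decision to leave disagreeing samples unlabeled are all hypothetical; the error-aware dynamic assignment strategy mentioned in the abstract is omitted because its details are not described here.

```python
# Hedged sketch of bi-consistency pseudo-label selection (illustrative only).
import numpy as np

def bi_consistency_pseudo_labels(probs_a_weak, probs_a_strong, probs_b_weak):
    """Select pseudo-labels for a batch of unlabeled samples.

    probs_a_weak  : (N, C) softmax outputs of model A on weakly augmented views
    probs_a_strong: (N, C) softmax outputs of model A on strongly augmented views
    probs_b_weak  : (N, C) softmax outputs of a second model B on the weak views

    Intra-model consistency: model A agrees with itself across augmentations.
    Inter-model consistency: model A agrees with model B on the weak view.
    Samples satisfying both checks receive A's weak-view prediction as a
    pseudo-label; the rest are marked -1 (left unlabeled in this sketch).
    """
    pred_a_weak = probs_a_weak.argmax(axis=1)
    pred_a_strong = probs_a_strong.argmax(axis=1)
    pred_b_weak = probs_b_weak.argmax(axis=1)

    intra_consistent = pred_a_weak == pred_a_strong
    inter_consistent = pred_a_weak == pred_b_weak

    pseudo = np.full(len(pred_a_weak), -1, dtype=int)
    keep = intra_consistent & inter_consistent
    pseudo[keep] = pred_a_weak[keep]
    return pseudo, keep
```

In the paper, samples failing one or both checks are not necessarily discarded; the error-aware dynamic pseudo-label assignment strategy governs how they are handled, which this sketch does not attempt to reproduce.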
Similar Papers
Improving Consistency in Large Language Models through Chain of Guidance
Computation and Language
Makes AI answers more reliable and trustworthy.
Cross-Domain Few-Shot Learning via Multi-View Collaborative Optimization with Vision-Language Models
CV and Pattern Recognition
Helps computers understand new pictures better.
Unlabeled Data or Pre-trained Model: Rethinking Semi-Supervised Learning and Pretrain-Finetuning
Machine Learning (CS)
Shows when pre-trained models beat learning from unlabeled data.