Neural Coherence: Find higher performance to out-of-distribution tasks from few samples
By: Simon Guiroy, Mats Richter, Sarath Chandar and more
To create state-of-the-art models for many downstream tasks, it has become common practice to fine-tune a pre-trained large vision model. However, it remains an open question how best to determine which of the many model checkpoints produced by a large training run to use as the starting point. This becomes especially important when data for the target task of interest is scarce, unlabeled, and out-of-distribution. In such scenarios, common methods that rely on in-distribution validation data become unreliable or inapplicable. This work proposes a novel approach to model selection that operates reliably on just a few unlabeled examples from the target task. Our approach is based on a novel concept, Neural Coherence, which entails characterizing a model's activation statistics for the source and target domains, allowing one to define model selection methods with high data-efficiency. We provide experiments in which models are pre-trained on ImageNet1K and evaluated on target domains consisting of Food-101, PlantNet-300K, and iNaturalist. We also evaluate the approach in many meta-learning settings. Our approach significantly improves generalization across these different target domains compared to established baselines. We further demonstrate the versatility of Neural Coherence as a powerful principle by showing its effectiveness in training data selection.
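The abstract describes comparing activation statistics between source and target domains to choose a checkpoint, but it does not state the selection rule itself. The sketch below is therefore only an illustration of the general idea, not the authors' Neural Coherence method. It assumes that activations for one layer have already been extracted per checkpoint, and it uses a hypothetical stand-in score (the gap between per-unit means and standard deviations) to rank checkpoints. The function names `activation_stats`, `stats_gap`, and `select_checkpoint` are all invented for this example.

```python
import numpy as np


def activation_stats(acts):
    """Per-unit mean and standard deviation of activations shaped (n_samples, n_units)."""
    acts = np.asarray(acts, dtype=float)
    return acts.mean(axis=0), acts.std(axis=0)


def stats_gap(source_acts, target_acts):
    """Hypothetical score: average absolute gap between source and target moments.

    This is an assumed stand-in, not the criterion used in the paper.
    """
    mu_s, sd_s = activation_stats(source_acts)
    mu_t, sd_t = activation_stats(target_acts)
    return float(np.mean(np.abs(mu_s - mu_t)) + np.mean(np.abs(sd_s - sd_t)))


def select_checkpoint(checkpoint_acts):
    """Pick the checkpoint whose source and target statistics are closest.

    checkpoint_acts: list of (source_acts, target_acts) pairs, one per checkpoint.
    Returns (index of the lowest-gap checkpoint, list of all scores).
    """
    scores = [stats_gap(s, t) for s, t in checkpoint_acts]
    return int(np.argmin(scores)), scores
```

In use, each pair would hold activations from the same layer of one checkpoint, computed on a source batch and on the few unlabeled target examples; no target labels are needed, which matches the setting the abstract describes.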
Similar Papers
Train on Validation (ToV): Fast data selection with applications to fine-tuning
Machine Learning (CS)
Finds best examples to teach computers faster.
Conditional updates of neural network weights for increased out of training performance
Machine Learning (CS)
Teaches computers to work with new, different information.
NeuroADDA: Active Discriminative Domain Adaptation in Connectomic
CV and Pattern Recognition
Teaches computers to map brain connections faster.