AttenDence: Maximizing Attention Confidence for Test Time Adaptation
By: Yash Mali
Potential Business Impact:
Helps AI see better when pictures change.
Test-time adaptation (TTA) enables models to adapt to distribution shifts at inference time. While entropy minimization over the output distribution has proven effective for TTA, transformers offer an additional unsupervised learning signal through their attention mechanisms. We propose minimizing the entropy of the attention distribution from the CLS token to the image patches as a novel TTA objective. This approach encourages the model to attend more confidently to relevant image regions under distribution shift and is effective even when only a single test image is available. We demonstrate that attention entropy minimization improves robustness across diverse corruption types without hurting performance on clean data, even on a single-sample stream of images at test time.
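To make the objective concrete, here is a minimal PyTorch sketch of an attention-entropy loss and one adaptation step on a single test image. The `vit` forward interface returning `(logits, attn)`, the renormalization of the CLS-to-patch slice, and the choice of adapted parameters are assumptions for illustration; the paper's exact implementation may differ.

```python
import torch

def cls_attention_entropy(attn: torch.Tensor) -> torch.Tensor:
    """Entropy of the CLS-token attention over image patches.

    attn: (batch, heads, num_patches) attention weights from the CLS
    query to each patch key, normalized along the last dimension.
    """
    entropy = -(attn * torch.log(attn.clamp_min(1e-12))).sum(dim=-1)
    return entropy.mean()  # average over batch and heads

def adapt_one_step(vit, optimizer, image):
    """One TTA gradient step on a single image (a sketch).

    `vit` is assumed to return (logits, attn), where attn holds the
    last block's attention weights of shape (batch, heads, tokens,
    tokens) with the CLS token at index 0 -- a hypothetical interface.
    """
    logits, attn = vit(image.unsqueeze(0))
    cls_to_patch = attn[:, :, 0, 1:]  # CLS query -> patch keys
    # Renormalize so the patch slice is a proper distribution
    # (an assumption about the objective's exact form).
    cls_to_patch = cls_to_patch / cls_to_patch.sum(dim=-1, keepdim=True)
    loss = cls_attention_entropy(cls_to_patch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return logits.detach()
```

A common design choice in entropy-based TTA (e.g., Tent) is to restrict the optimizer to normalization-layer affine parameters, such as LayerNorm weights in a ViT, rather than updating the full model.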
Similar Papers
Open-World Test-Time Adaptation with Hierarchical Feature Aggregation and Attention Affine
CV and Pattern Recognition
Helps AI tell real from fake, even when surprised.
Adapt in the Wild: Test-Time Entropy Minimization with Sharpness and Feature Regularization
Machine Learning (CS)
Makes AI smarter even with messy, changing data.
Adaptive Cache Enhancement for Test-Time Adaptation of Vision-Language Models
CV and Pattern Recognition
Helps AI see better when things look different.