
O-TPT: Orthogonality Constraints for Calibrating Test-time Prompt Tuning in Vision-Language Models

Published: March 15, 2025 | arXiv ID: 2503.12096v1

By: Ashshak Sharifdeen, Muhammad Akhtar Munir, Sanoojan Baliah and more

Potential Business Impact:

Makes AI image guesses more trustworthy and accurate.

Business Areas:

A/B Testing Data and Analytics

Test-time prompt tuning for vision-language models (VLMs) is attracting attention because it can learn from unlabeled data without fine-tuning. Although test-time prompt tuning methods for VLMs can boost accuracy, the resulting models tend to be poorly calibrated, which casts doubt on their reliability and trustworthiness. Calibrating test-time prompt tuning in VLMs therefore deserves more attention. To this end, we propose a new approach, called O-TPT, that introduces orthogonality constraints on the textual features corresponding to the learnable prompts. We make two contributions. First, we uncover new insights into the suboptimal calibration performance of existing methods that rely on textual feature dispersion. Second, we show that imposing a simple orthogonalization of textual features is a more effective way to obtain textual dispersion. We conduct extensive experiments on various datasets with different backbones and baselines. The results indicate that our method consistently outperforms the prior state of the art, significantly reducing the overall average calibration error. Our method also surpasses zero-shot calibration performance on fine-grained classification tasks.
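The abstract does not spell out the exact form of the orthogonality constraint. One common way to express such a constraint, shown here only as an illustrative sketch and not as the authors' implementation, is to penalize the squared Frobenius distance between the Gram matrix of the L2-normalized per-class text features and the identity matrix. The function names `l2_normalize` and `orthogonality_penalty` and the toy feature vectors are hypothetical, introduced for this example only.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def orthogonality_penalty(text_features):
    """Squared Frobenius norm of (E E^T - I), where the rows of E are the
    L2-normalized text features, one row per class.

    Assumption: this is a generic orthogonality regularizer chosen for
    illustration; the abstract does not give the paper's exact loss.
    """
    rows = [l2_normalize(row) for row in text_features]
    penalty = 0.0
    for i, a in enumerate(rows):
        for j, b in enumerate(rows):
            gram = sum(x * y for x, y in zip(a, b))  # cosine similarity
            target = 1.0 if i == j else 0.0          # identity matrix entry
            penalty += (gram - target) ** 2
    return penalty


# Toy per-class text features (hypothetical values, 3 classes, 3 dimensions).
orthogonal = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
collapsed = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]

print(orthogonality_penalty(orthogonal))  # mutually orthogonal features
print(orthogonality_penalty(collapsed))   # identical features for every class
```

In this sketch the penalty is zero when the class features are mutually orthogonal and grows as they align with each other. In a test-time prompt tuning setup, a term of this kind would presumably be added, with some weight, to the tuning objective so that the learned prompts keep class text features spread apart.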

A-TPT: Angular Diversity Calibration Properties for Test-Time Prompt Tuning of Vision-Language Models

CV and Pattern Recognition

Makes AI better at understanding new things.

30 Oct 2025 1

92%

D-TPT: Dimensional Entropy Maximization for Calibrating Test-Time Prompt Tuning in Vision-Language Models

CV and Pattern Recognition

Makes AI better at understanding new things.

10 Oct 2025 0

91%

R-TPT: Improving Adversarial Robustness of Vision-Language Models through Test-Time Prompt Tuning

Machine Learning (CS)

Protects AI from tricky, fake pictures.

15 Apr 2025 1


Country of Origin

🇦🇪 United Arab Emirates

Page Count

14 pages


