Score: 1

MetaTPT: Meta Test-time Prompt Tuning for Vision-Language Models

Published: December 13, 2025 | arXiv ID: 2512.12268v1

By: Yuqing Lei, Yingjun Du, Yawen Huang, and more

Potential Business Impact:

Helps AI models recognize images correctly even when the pictures they see at test time look different from the data they were trained on.

Business Areas:
A/B Testing, Data and Analytics

Vision-language models (VLMs) such as CLIP exhibit strong zero-shot generalization but remain sensitive to domain shifts at test time. Test-time prompt tuning (TPT) mitigates this issue by adapting prompts using fixed augmentations, an approach that may falter in more challenging settings. In this work, we propose Meta Test-Time Prompt Tuning (MetaTPT), a meta-learning framework that learns a self-supervised auxiliary task to guide test-time prompt tuning. The auxiliary task dynamically learns parameterized augmentations for each sample, enabling more expressive transformations that capture essential features in target domains. MetaTPT adopts a dual-loop optimization paradigm: an inner loop learns a self-supervised task that generates informative views, while the outer loop performs prompt tuning by enforcing consistency across these views. By coupling augmentation learning with prompt tuning, MetaTPT improves test-time adaptation under domain shifts. Extensive experiments demonstrate that MetaTPT achieves state-of-the-art performance on domain generalization and cross-dataset benchmarks.
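The sketch below illustrates the dual-loop idea from the abstract only: an inner loop that fits a per-sample, parameterized augmentation with a self-supervised objective, and an outer loop that tunes the prompt by enforcing consistency across the learned views. The `LearnableAugment` module, the specific losses, and the `classify` function are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a MetaTPT-style dual-loop test-time adaptation step,
# reconstructed from the abstract. All module and loss choices are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableAugment(nn.Module):
    """Parameterized per-sample augmentation (hypothetical: a learned channel-wise affine)."""
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(3, 1, 1))
        self.shift = nn.Parameter(torch.zeros(3, 1, 1))

    def forward(self, x):
        return x * self.scale + self.shift

def consistency_loss(logits_views):
    """Outer-loop objective: agreement across views, here the entropy of the
    averaged softmax prediction (a common TPT-style choice)."""
    probs = torch.stack([F.softmax(l, dim=-1) for l in logits_views]).mean(0)
    return -(probs * probs.clamp_min(1e-8).log()).sum(-1).mean()

def adapt_one_sample(image, prompt, classify, n_views=4, inner_steps=1, outer_steps=1):
    """Dual-loop adaptation for a single test image.
    `classify(image, prompt)` maps an image and prompt parameters to class logits."""
    augment = LearnableAugment()
    aug_opt = torch.optim.SGD(augment.parameters(), lr=1e-2)
    prompt_opt = torch.optim.SGD([prompt], lr=1e-3)

    for _ in range(outer_steps):
        # Inner loop: learn the per-sample augmentation with a self-supervised
        # stand-in task (agreement between two noisy views, prompt frozen).
        for _ in range(inner_steps):
            views = [augment(image + 0.01 * torch.randn_like(image)) for _ in range(2)]
            logits = [classify(v, prompt.detach()) for v in views]
            inner = F.mse_loss(F.softmax(logits[0], -1), F.softmax(logits[1], -1))
            aug_opt.zero_grad()
            inner.backward()
            aug_opt.step()

        # Outer loop: tune the prompt by enforcing consistency across the learned views.
        views = [augment(image + 0.01 * torch.randn_like(image)).detach() for _ in range(n_views)]
        logits = [classify(v, prompt) for v in views]
        outer = consistency_loss(logits)
        prompt_opt.zero_grad()
        outer.backward()
        prompt_opt.step()
    return prompt

# Toy usage with a dummy linear head standing in for a CLIP-style scorer.
if __name__ == "__main__":
    head = nn.Linear(3 * 32 * 32 + 8, 10)
    def classify(img, prompt):
        return head(torch.cat([img.flatten(1), prompt.expand(img.size(0), -1)], dim=1))
    prompt = nn.Parameter(torch.zeros(1, 8))
    image = torch.rand(1, 3, 32, 32)
    adapt_one_sample(image, prompt, classify)
```

In this reading, only the augmentation parameters change in the inner loop and only the prompt changes in the outer loop, which mirrors the abstract's coupling of augmentation learning with prompt tuning; the real method's auxiliary task and view generation are richer than this stand-in.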

Page Count
16 pages

Category
Computer Science:
CV and Pattern Recognition