Dual-Path Stable Soft Prompt Generation for Domain Generalization
By: Yuedi Zhang, Shuanghao Bai, Wanqi Zhou, and others
Potential Business Impact:
Helps AI models perform well on new data, even when it differs from the data they were trained on.
Domain generalization (DG) aims to learn a model from one or multiple related but distinct source domains that generalizes well to unseen out-of-distribution target domains. Inspired by the success of large pre-trained vision-language models (VLMs), prompt tuning has emerged as an effective generalization strategy. However, it often struggles to capture domain-specific features due to its reliance on manually designed or fixed prompt inputs. Recently, some prompt generation methods have addressed this limitation by dynamically generating instance-specific and domain-specific prompts for each input, enriching domain information and demonstrating potential for enhanced generalization. Through further investigation, we identify a notable issue in existing prompt generation methods: the same input often yields significantly different and suboptimal prompts across different random seeds, a phenomenon we term Prompt Variability. To address this, we introduce negative learning into the prompt generation process and propose Dual-Path Stable Soft Prompt Generation (DPSPG), a transformer-based framework designed to improve both the stability and generalization of prompts. Specifically, DPSPG incorporates a complementary prompt generator to produce negative prompts, thereby reducing the risk of introducing misleading information. Both theoretical and empirical analyses demonstrate that negative learning leads to more robust and effective prompts by increasing the effective margin and reducing the upper bound of the gradient norm. Extensive experiments on five DG benchmark datasets show that DPSPG consistently outperforms state-of-the-art methods while maintaining prompt stability.
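The abstract describes a two-path design: a transformer-based generator produces an instance-conditioned soft prompt, while a complementary generator produces a negative prompt trained with negative learning. Below is a minimal PyTorch sketch of that idea, assuming CLIP-style image features as input; the class names (PromptGenerator, DualPathPromptGenerator, dual_path_loss), the dimensions, and the specific negative-learning loss are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of dual-path soft prompt generation with negative
# learning, based only on the abstract above. Names and architecture
# choices are assumptions; the paper's actual design may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PromptGenerator(nn.Module):
    """Transformer encoder mapping an image feature to soft prompt tokens."""

    def __init__(self, feat_dim=512, prompt_len=4, num_layers=2, num_heads=8):
        super().__init__()
        # Learnable query tokens that become the instance-specific prompt.
        self.queries = nn.Parameter(torch.randn(prompt_len, feat_dim) * 0.02)
        layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=num_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, image_feat):
        # image_feat: (B, feat_dim), e.g. from a frozen CLIP image encoder.
        b = image_feat.size(0)
        # Condition the query tokens on the instance feature via attention.
        tokens = torch.cat(
            [image_feat.unsqueeze(1), self.queries.expand(b, -1, -1)], dim=1
        )
        out = self.encoder(tokens)
        return out[:, 1:, :]  # (B, prompt_len, feat_dim) soft prompt


class DualPathPromptGenerator(nn.Module):
    """Positive path generates the prompt; a complementary path generates
    a negative prompt intended to absorb misleading, unstable cues."""

    def __init__(self, feat_dim=512, prompt_len=4):
        super().__init__()
        self.pos_gen = PromptGenerator(feat_dim, prompt_len)
        self.neg_gen = PromptGenerator(feat_dim, prompt_len)

    def forward(self, image_feat):
        return self.pos_gen(image_feat), self.neg_gen(image_feat)


def dual_path_loss(logits_pos, logits_neg, labels):
    """Positive path: standard cross-entropy toward the true class.
    Negative path (negative learning): minimize the probability the
    negative prompt's prediction assigns to the true class."""
    ce = F.cross_entropy(logits_pos, labels)
    p_true = F.softmax(logits_neg, dim=-1).gather(1, labels[:, None]).squeeze(1)
    nl = -torch.log(1.0 - p_true + 1e-8).mean()
    return ce + nl
```

In a full pipeline, the generated prompt tokens would typically be prepended to class-name embeddings and passed through a frozen CLIP text encoder to produce the classifier weights behind logits_pos and logits_neg; the paper's exact conditioning scheme and loss weighting may differ from this sketch.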
Similar Papers
Federated Domain Generalization with Domain-specific Soft Prompts Generation
CV and Pattern Recognition
Helps AI generalize across separate data sources without pooling their data.
Multi-Modal Style Transfer-based Prompt Tuning for Efficient Federated Domain Generalization
Distributed, Parallel, and Cluster Computing
Helps AI learn efficiently from many distributed data sources.
Generalizing Vision-Language Models with Dedicated Prompt Guidance
CV and Pattern Recognition
Helps vision-language AI adapt to unfamiliar data.