CVPR 2025
Adversarial Domain Prompt Tuning and Generation for Single Domain Generalization
Zhipeng Xu, De Cheng, Xinyang Jiang, Nannan Wang, Dongsheng Li, Xinbo Gao
Abstract
Single domain generalization (SDG) aims to learn a robust model that performs well on many unseen domains when only a single domain is available for training. One promising direction for achieving single-domain generalization is to generate out-of-domain (OOD) training data through data augmentation or image generation. Given the rapid advancements in AI-generated content (AIGC), this paper is the first to propose leveraging powerful pre-trained text-to-image (T2I) foundation models to create the training data. However, manually designing textual prompts to generate images for all possible domains is often impractical, and some domain characteristics may be too abstract to describe in words. To address these challenges, we propose a novel Progressive Adversarial Prompt Tuning (PAPT) framework for pre-trained diffusion models. Instead of relying on static textual domains, our approach learns two sets of abstract prompts as conditions for the diffusion model: one that captures domain-invariant category information and another that models domain-specific styles. This adversarial learning mechanism enables the T2I model to generate images in various domain styles while preserving key categorical features. Extensive experiments demonstrate the effectiveness of the proposed method, which achieves superior performance compared to state-of-the-art single-domain generalization approaches.

each training domain will be distributionally different from previous ones. Thus, we have a higher probability of learning more challenging abstract domains.

Experiments

4.1. Datasets and Evaluation Protocols

Following previous works [49, 58], we adopt five benchmark datasets commonly used in DG tasks for evaluation: PACS [32], VLCS [32], OfficeHome [62], DomainNet [45], and TerraIncognita [2]. To ensure reliable results, we report the average performance across multiple experiments.
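The averaging described above can be illustrated with a minimal sketch. It assumes the common SDG convention of training on one source domain, evaluating on every other domain of the dataset, and then averaging over repeated runs; this section does not state the exact protocol or the number of runs. The function name `sdg_average_accuracy` is hypothetical, and the accuracy values are made-up placeholders, not results from the paper.

```python
from statistics import mean


def sdg_average_accuracy(results, source):
    """Average target-domain accuracy over runs (assumed protocol).

    results: {run_id: {domain_name: accuracy}}, one entry per repeated run.
    source:  name of the single training domain, excluded from the average.
    """
    per_run = []
    for run_acc in results.values():
        # Only the unseen (target) domains count towards the SDG score.
        targets = [acc for dom, acc in run_acc.items() if dom != source]
        per_run.append(mean(targets))
    # Average the per-run target means across all runs.
    return mean(per_run)


# Placeholder numbers for two runs on PACS-style domains (illustrative only).
runs = {
    0: {"photo": 0.99, "art_painting": 0.60, "cartoon": 0.50, "sketch": 0.40},
    1: {"photo": 0.99, "art_painting": 0.62, "cartoon": 0.52, "sketch": 0.42},
}
avg = sdg_average_accuracy(runs, source="photo")
```

Excluding the source domain keeps in-domain accuracy, which is typically much higher, from inflating the reported generalization score.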
Implementation details. In our implementation, we adopt the [...] for OfficeHome, pretrained on ImageNet [15], as the backbone for the SDG setting,