PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization
Junhyeong Cho, Gilhyun Nam, Sungyeon Kim, Hunmin Yang, Suha Kwak

TL;DR
PromptStyler synthesizes diverse styles via prompts in a joint vision-language space, enabling source-free domain generalization without using any images, and achieves state-of-the-art results on PACS, VLCS, OfficeHome, and DomainNet.
Contribution
It proposes a prompt-based style synthesis approach that operates in a joint vision-language space, addressing domain generalization when no source-domain data is available.
Findings
Achieves state-of-the-art performance on PACS, VLCS, OfficeHome, and DomainNet datasets.
Effectively synthesizes diverse styles without using images for training.
Maintains content integrity while varying styles in the joint space.
Abstract
In a joint vision-language space, a text feature (e.g., from "a photo of a dog") could effectively represent its relevant image features (e.g., from dog photos). Also, a recent study has demonstrated the cross-modal transferability phenomenon of this joint space. From these observations, we propose PromptStyler which simulates various distribution shifts in the joint space by synthesizing diverse styles via prompts without using any images to deal with source-free domain generalization. The proposed method learns to generate a variety of style features (from "a S* style of a") via learnable style word vectors for pseudo-words S*. To ensure that learned styles do not distort content information, we force style-content features (from "a S* style of a [class]") to be located nearby their corresponding content features (from "[class]") in the joint vision-language space. After learning…
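The two training goals described in the abstract (styles should be diverse, and style-content features should stay close to their content features) can be illustrated with a minimal sketch. The loss forms below are assumptions chosen for illustration, not taken from the text above: diversity is measured as the mean absolute cosine similarity between style features, and content consistency as a cross-entropy over cosine similarities. The function names are hypothetical, and random arrays stand in for the outputs of a text encoder.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def style_diversity_loss(style_feats):
    """Assumed form: mean absolute cosine similarity between distinct style features.

    style_feats: (n_styles, dim) features from prompts such as "a S* style of a".
    Lower values mean the styles point in more different directions.
    """
    s = l2_normalize(style_feats)
    sim = s @ s.T                                  # (n_styles, n_styles)
    off_diag = sim[~np.eye(sim.shape[0], dtype=bool)]
    return float(np.abs(off_diag).mean())

def content_consistency_loss(style_content_feats, content_feats):
    """Assumed form: cross-entropy where row i should match content feature i.

    style_content_feats: (n_classes, dim), from "a S* style of a [class]".
    content_feats:       (n_classes, dim), from "[class]".
    """
    sc = l2_normalize(style_content_feats)
    c = l2_normalize(content_feats)
    logits = sc @ c.T                              # (n_classes, n_classes)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.diag(log_probs).mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    styles = rng.normal(size=(5, 16))              # stand-in style features
    content = rng.normal(size=(7, 16))             # stand-in content features
    style_content = content + 0.1 * rng.normal(size=(7, 16))
    print("diversity loss:", style_diversity_loss(styles))
    print("consistency loss:", content_consistency_loss(style_content, content))
```

In an actual training loop the style word vectors would be the only learnable parameters and both terms would be minimized with a differentiable framework; NumPy is used here only to keep the arithmetic visible.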
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Text and Document Classification Technologies
Methods: Contrastive Language-Image Pre-training · Additive Angular Margin Loss · Softmax · Transformer · Contrastive Learning
