Regularized Training with Generated Datasets for Name-Only Transfer of Vision-Language Models

Minho Park; Sunghyun Park; Jooyeol Yun; Jaegul Choo

arXiv:2406.05432 · cs.CV · June 11, 2024

TL;DR

This paper introduces regularization techniques to improve fine-tuning of vision-language models on generated datasets, effectively addressing domain gaps and enhancing performance in name-only transfer scenarios.

Contribution

The paper proposes two regularization methods, one applied during training and one applied post-training, to mitigate the domain gap between real and generated images when fine-tuning vision-language models on generated datasets.

Findings

01 Regularization improves the fine-tuned model's performance on real data.

02 Feature diversity correlates with better transfer results.

03 The proposed methods achieve state-of-the-art performance on multiple datasets.

Abstract

Recent advancements in text-to-image generation have inspired researchers to use generative models to build datasets tailored for perception models, which prove particularly valuable in scenarios where real-world data is limited. In this study, our goal is to address the challenges that arise when fine-tuning vision-language models (e.g., CLIP) on generated datasets. Specifically, we aim to fine-tune a vision-language model into a classifier for a specific task without access to any real images, a setting also known as name-only transfer. However, despite the high fidelity of generated images, we observed significant performance degradation when fine-tuning the model on generated datasets, due to the domain gap between real and generated images. To overcome the domain gap, we provide two regularization methods for training and post-training, respectively. First, we leverage the domain-agnostic…
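As a rough illustration of the name-only setting mentioned in the abstract, the sketch below builds a classifier purely from class names: each class is represented by a text embedding, and an image is assigned to the class whose embedding is most similar. This is the generic zero-shot recipe used with CLIP-style models, not the paper's regularization methods. The 3-dimensional vectors, the class names, and the helper names (l2_normalize, cosine_logits, predict) are hypothetical stand-ins chosen for the example; a real setup would obtain the embeddings from the model's text and image encoders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_logits(image_feat, class_text_feats):
    """Return the cosine similarity between one image feature and each class text feature."""
    img = l2_normalize(image_feat)
    return {
        name: sum(a * b for a, b in zip(img, l2_normalize(text_feat)))
        for name, text_feat in class_text_feats.items()
    }


def predict(image_feat, class_text_feats):
    """Pick the class name whose text feature is most similar to the image feature."""
    logits = cosine_logits(image_feat, class_text_feats)
    return max(logits, key=logits.get)


# Hypothetical embeddings: in practice these would come from a text encoder
# applied to prompts built from the class names alone (no real images needed).
class_text_feats = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.1, 0.9, 0.0],
    "car": [0.0, 0.1, 0.9],
}

# Hypothetical image embedding standing in for an image-encoder output.
image_feat = [0.8, 0.3, 0.1]

logits = cosine_logits(image_feat, class_text_feats)
prediction = predict(image_feat, class_text_feats)
print(prediction)
```

The point of the sketch is only that the class names define the classifier; fine-tuning on generated images, and the regularization the paper adds on top, would start from a classifier of this kind.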

Code & Models

Repositories

pmh9960/regft-for-gen (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications