Debiased Fine-Tuning for Vision-language Models by Prompt Regularization

Beier Zhu; Yulei Niu; Saeil Lee; Minhoe Hur; Hanwang Zhang

arXiv:2301.12429 · cs.CV · August 14, 2025 · 1 cites

TL;DR

This paper introduces Prompt Regularization (ProReg), a fine-tuning method for vision-language models that leverages prompt-based predictions to prevent overfitting and improve out-of-distribution performance.

Contribution

ProReg is a novel fine-tuning approach that uses prompt-based regularization with adaptive weighting to better utilize pretraining knowledge and reduce bias from downstream data.

Findings

01 ProReg outperforms traditional fine-tuning and prompt tuning on various benchmarks.

02 It effectively reduces overfitting by leveraging prompt-based predictions.

03 ProReg demonstrates strong out-of-distribution generalization.

Abstract

We present a new paradigm for fine-tuning large-scale vision-language pre-trained models on downstream tasks, dubbed Prompt Regularization (ProReg). Unlike traditional fine-tuning, which easily overfits to the downstream task data, ProReg uses the prediction obtained by prompting the pretrained model to regularize the fine-tuning. The motivation is that when the large model is prompted with "a photo of a [CLASS]", the fill-in answer depends only on the pretraining encyclopedic knowledge and is independent of the task data distribution, which is usually biased. Specifically, given a training sample prediction during fine-tuning, we first calculate the Kullback-Leibler loss with respect to the prompt prediction and the Cross-Entropy loss with respect to the ground-truth label, and then combine them with a proposed sample-wise adaptive trade-off weight, which automatically adjusts the transfer between the pretrained and downstream…
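The loss combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the direction of the KL term, and the convex-combination form are assumptions made for illustration, and the `weight` argument is a caller-supplied placeholder because the abstract as shown here does not give the formula for the sample-wise adaptive weight.

```python
import math


def softmax(logits):
    """Convert raw class scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Cross-Entropy loss of a predicted distribution against a class index."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def proreg_style_loss(finetuned_logits, prompt_logits, label, weight):
    """Hypothetical per-sample loss mixing a label term and a prompt term.

    finetuned_logits: scores from the model being fine-tuned.
    prompt_logits: scores from prompting the frozen pretrained model.
    weight: trade-off in [0, 1]; a placeholder for the paper's adaptive
        weight, whose formula is NOT reproduced here (assumption).
    """
    finetuned_probs = softmax(finetuned_logits)
    prompt_probs = softmax(prompt_logits)
    ce = cross_entropy(finetuned_probs, label)
    # Assumed direction: pull the fine-tuned prediction toward the prompt one.
    kl = kl_divergence(prompt_probs, finetuned_probs)
    return weight * ce + (1.0 - weight) * kl


if __name__ == "__main__":
    loss = proreg_style_loss([2.0, 0.5, -1.0], [1.0, 1.2, -0.5], label=0, weight=0.7)
    print(f"combined loss: {loss:.4f}")
```

In this sketch, setting `weight` to 1 reduces the loss to plain Cross-Entropy fine-tuning, while setting it to 0 only matches the prompt prediction; a per-sample weight moves between those two extremes.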

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications