Bi-CoG: Bi-Consistency-Guided Self-Training for Vision-Language Models

Rui Zhu; Song-Lin Lv; Zi-Kang Wang; Lan-Zhe Guo

arXiv:2510.20477·cs.LG·May 12, 2026

TL;DR

Bi-CoG introduces a novel self-training method that enhances vision-language model fine-tuning by leveraging bi-consistency and dynamic pseudo-labeling, improving performance across multiple datasets.

Contribution

The paper presents Bi-CoG, a simple plug-and-play approach that reduces bias and hyperparameter sensitivity in semi-supervised fine-tuning of vision-language models.

Findings

01. Bi-CoG significantly outperforms existing methods on 14 datasets.

02. Bi-CoG effectively reduces model bias and hyperparameter sensitivity.

03. Theoretical analysis supports the robustness of Bi-CoG.

Abstract

Exploiting unlabeled data through semi-supervised learning (SSL) or leveraging pre-trained models via fine-tuning are two prevailing paradigms for addressing label-scarce scenarios. Recently, growing attention has been given to combining fine-tuning of pre-trained vision-language models (VLMs) with SSL, forming the emerging paradigm of semi-supervised fine-tuning. However, existing methods often suffer from model bias and hyperparameter sensitivity, due to reliance on prediction consistency or pre-defined confidence thresholds. To address these limitations, we propose a simple yet effective plug-and-play methodology named Bi-Consistency-Guided Self-Training (Bi-CoG), which assigns high-quality and low-bias pseudo-labels, by simultaneously exploiting inter-model and intra-model consistency, along with an error-aware dynamic pseudo-label…
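
To make the idea of combining inter-model and intra-model consistency concrete, here is a minimal sketch of one way such a filter could look. It is not the paper's algorithm: the abstract does not say how either consistency is measured, so the sketch assumes that intra-model consistency means one model predicts the same class on two augmented views of a sample, and that inter-model consistency means a second model predicts that same class. The function names and the three probability inputs are hypothetical, and the error-aware dynamic pseudo-label component is left out because the abstract is truncated before describing it.

```python
from typing import List, Optional, Sequence


def argmax(probs: Sequence[float]) -> int:
    """Return the index of the largest probability."""
    return max(range(len(probs)), key=lambda i: probs[i])


def bi_consistent_pseudo_labels(
    probs_a_view1: Sequence[Sequence[float]],
    probs_a_view2: Sequence[Sequence[float]],
    probs_b_view1: Sequence[Sequence[float]],
) -> List[Optional[int]]:
    """Assign a pseudo-label only when both consistency checks pass.

    ASSUMPTION (not from the paper): consistency is plain argmax agreement.
    - intra-model: model A agrees with itself across two augmented views
    - inter-model: model A and model B agree on the same view
    Samples that fail either check get None and would stay unlabeled.
    """
    labels: List[Optional[int]] = []
    for pa1, pa2, pb1 in zip(probs_a_view1, probs_a_view2, probs_b_view1):
        y_a1, y_a2, y_b1 = argmax(pa1), argmax(pa2), argmax(pb1)
        intra_ok = y_a1 == y_a2  # same model, different views
        inter_ok = y_a1 == y_b1  # different models, same view
        labels.append(y_a1 if intra_ok and inter_ok else None)
    return labels


if __name__ == "__main__":
    # Toy class probabilities for three unlabeled samples and three classes.
    a_v1 = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.4, 0.5, 0.1]]
    a_v2 = [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.2, 0.6, 0.2]]
    b_v1 = [[0.9, 0.05, 0.05], [0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
    print(bi_consistent_pseudo_labels(a_v1, a_v2, b_v1))
```

Note that this filter uses no fixed confidence threshold. A sample is kept purely on agreement, which is one plausible reading of how a consistency-based scheme could sidestep the threshold sensitivity the abstract attributes to existing methods.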
