Alternating Training-based Label Smoothing Enhances Prompt Generalization

Yang Chen; Yanbin Wei; Ke Jin; Yi Kong; James Kwok; Yu Zhang

arXiv:2508.17846 · cs.CV · October 2, 2025


TL;DR

This paper introduces ATLaS, an alternating training-based label smoothing method that improves the generalization of prompt tuning in vision-language models by alternating between standard one-hot labels and soft labels, and supports the method with theoretical analysis.

Contribution

The paper proposes ATLaS, a method that alternates training between hard (one-hot) labels and soft labels to improve the generalization of prompt tuning, together with theoretical insights and efficient soft-label strategies.

Findings

01 ATLaS consistently improves prompt tuning performance.

02 Combining ATLaS with Class-wise Soft Labels (CSL) and Instance-wise Soft Labels (ISL) enhances generalization.

03 ATLaS is compatible with existing prompt tuning methods.

Abstract

Recent advances in pre-trained vision-language models have demonstrated remarkable zero-shot generalization capabilities. To further enhance these models' adaptability to various downstream tasks, prompt tuning has emerged as a parameter-efficient fine-tuning method. However, despite its efficiency, the generalization ability of the learned prompts remains limited. In contrast, label smoothing (LS) has been widely recognized as an effective regularization technique that prevents models from becoming over-confident and improves their generalization. This inspires us to explore the integration of LS with prompt tuning. However, we have observed that vanilla LS even weakens the generalization ability of prompt tuning. To address this issue, we propose the Alternating Training-based Label Smoothing (ATLaS) method, which alternately trains with standard one-hot labels and soft labels generated by LS…
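To make the two kinds of supervision mentioned in the abstract concrete, the following is a minimal Python sketch of standard uniform label smoothing and of switching between one-hot and smoothed targets. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`one_hot`, `smooth_labels`, `target_for_epoch`), the smoothing strength `epsilon=0.1`, and the even/odd epoch schedule are all hypothetical, since the excerpt above does not specify how ATLaS schedules the alternation or how its soft labels are built.

```python
import math


def one_hot(label, num_classes):
    """Hard target: probability 1 on the true class, 0 elsewhere."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def smooth_labels(label, num_classes, epsilon=0.1):
    """Standard uniform label smoothing: (1 - eps) * one_hot + eps / K."""
    return [(1.0 - epsilon) * y + epsilon / num_classes
            for y in one_hot(label, num_classes)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy between a target distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0.0)


def target_for_epoch(epoch, label, num_classes, epsilon=0.1):
    """ASSUMPTION: even epochs use hard labels, odd epochs use soft labels.

    The actual alternation schedule in the paper may differ.
    """
    if epoch % 2 == 0:
        return one_hot(label, num_classes)
    return smooth_labels(label, num_classes, epsilon)


if __name__ == "__main__":
    logits = [2.0, 0.5, -1.0, 0.1]  # hypothetical class logits for one image
    for epoch in range(4):
        target = target_for_epoch(epoch, label=0, num_classes=4)
        print(epoch, round(cross_entropy(logits, target), 4))
```

In a real prompt-tuning setup the logits would come from image-text similarity scores of the vision-language model, and only the prompt parameters would receive gradients; the sketch covers target construction and the loss only.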
