TAME: Test-Time Adversarial Prompt Tuning via Mixture-of-Experts for Vision-Language Models
Xin Wang, Yixu Wang, Jiaming Zhang, Ruofan Wang, Jiaqi Yu, Kai Chen, Jingjing Chen, Xingjun Ma, Yu-Gang Jiang

TL;DR
TAME is a test-time defense method that enhances the adversarial robustness of vision-language models like CLIP by using a mixture-of-experts prompt tuning approach driven by unsupervised objectives.
Contribution
TAME introduces an input-conditioned Mixture-of-Experts (MoE) framework for adaptive prompt tuning at test time, improving robustness without retraining the entire model.
Findings
TAME increases CLIP's adversarial robustness by at least 49.1% under AutoAttack.
It outperforms existing prompt tuning methods with an average robustness gain of 30.2%.
TAME maintains high accuracy on clean samples while improving robustness.
Abstract
Large-scale pre-trained Vision-Language models (VLMs), such as CLIP, exhibit strong zero-shot generalization, yet remain highly vulnerable to imperceptible adversarial perturbations, raising serious safety concerns for open-world deployment. To enhance robustness without requiring downstream task-specific retraining, we propose TAME, a novel test-time defense. Building upon our prior Test-Time Adversarial Prompt Tuning (TAPT), TAME introduces an architectural reformulation by replacing TAPT's single adaptive prompt with an input-conditioned Mixture-of-Experts (MoE) framework, enabling more expressive and adaptive defense. Specifically, TAME maintains a bank of learnable expert prompts and employs an input-dependent routing mechanism to aggregate a customized prompt mixture for each unlabeled test sample at inference time. This test-time defense mechanism is driven by three unsupervised…
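The routing step described in the abstract (a bank of expert prompts aggregated into one prompt per test sample) can be illustrated with a small sketch. This is not the authors' implementation: the linear router, the softmax gating, the tensor shapes, and every name below (`route`, `mix_prompts`, `router_weights`, `expert_prompts`) are assumptions chosen for illustration. The unsupervised objectives that drive the actual test-time tuning are not shown, since the abstract above is truncated before naming them.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(image_feature, router_weights):
    """Assumed router: one linear score per expert, normalized with softmax.

    image_feature:  list of D floats (e.g. an image embedding).
    router_weights: K rows of D floats, one row per expert (hypothetical).
    Returns K gate values that sum to 1.
    """
    scores = [
        sum(w * x for w, x in zip(row, image_feature))
        for row in router_weights
    ]
    return softmax(scores)


def mix_prompts(expert_prompts, gates):
    """Gate-weighted sum of K expert prompts, each T tokens by D dims."""
    num_tokens = len(expert_prompts[0])
    dim = len(expert_prompts[0][0])
    mixed = [[0.0] * dim for _ in range(num_tokens)]
    for gate, prompt in zip(gates, expert_prompts):
        for t in range(num_tokens):
            for d in range(dim):
                mixed[t][d] += gate * prompt[t][d]
    return mixed


# Toy sizes, purely illustrative: 4 experts, 2 prompt tokens, 3 dims.
rng = random.Random(0)
K, T, D = 4, 2, 3
expert_prompts = [
    [[rng.gauss(0.0, 0.02) for _ in range(D)] for _ in range(T)]
    for _ in range(K)
]
router_weights = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(K)]

image_feature = [0.5, -1.0, 0.25]  # stand-in for one test sample's embedding
gates = route(image_feature, router_weights)
sample_prompt = mix_prompts(expert_prompts, gates)
```

In this sketch, a different `image_feature` yields different gate values and therefore a different mixed prompt, which is the sense in which the prompt is input-conditioned. Whether TAME uses dense gating as here, or a sparse variant such as top-k selection, is not stated in the text above.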
