Regularized Mask Tuning: Uncovering Hidden Knowledge in Pre-trained Vision-Language Models

Kecheng Zheng; Wei Wu; Ruili Feng; Kai Zhu; Jiawei Liu; Deli Zhao; Zheng-Jun Zha; Wei Chen; Yujun Shen

arXiv:2307.15049 · cs.CV · August 8, 2023

Open Access

TL;DR

Regularized mask tuning uncovers hidden knowledge in pre-trained vision-language models by learning binary masks over a selected set of frozen parameters, improving performance across multiple datasets without updating the pre-trained weights.

Contribution

The paper introduces a novel regularized mask tuning method that identifies and leverages important parameters for downstream tasks in pre-trained VLMs, enhancing transfer performance.

Findings

01. Achieved an 18.73% performance improvement over zero-shot CLIP.

02. Masked an average of only 2.56% of the parameters.

03. The method is compatible with existing tuning techniques and boosts performance when combined with them.

Abstract

Prompt tuning and adapter tuning have shown great potential in transferring pre-trained vision-language models (VLMs) to various downstream tasks. In this work, we design a new type of tuning method, termed regularized mask tuning, which masks the network parameters through a learnable selection. Inspired by neural pathways, we argue that the knowledge required by a downstream task already exists in the pre-trained weights but is concealed during the upstream pre-training stage. To bring this useful knowledge back to light, we first identify a set of parameters that are important to a given downstream task, then attach a binary mask to each of these parameters, and finally optimize the masks on the downstream data with the parameters frozen. When updating the masks, we introduce a novel gradient dropout strategy to regularize the parameter selection, in order to prevent the model from…
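The mask-tuning loop described in the abstract can be illustrated with a toy sketch in plain Python. This is not the authors' implementation. It makes several assumptions: a three-weight linear model stands in for the VLM, every weight gets a mask (the importance-based pre-selection step is skipped), binary masks are obtained by thresholding real-valued scores with a straight-through gradient, and "gradient dropout" is approximated by randomly skipping score updates, since the abstract excerpt does not spell out the actual rule. All function and parameter names (`binarize`, `predict`, `tune_mask`, `drop_prob`) are hypothetical.

```python
import random

def binarize(scores, threshold=0.0):
    """Hard-threshold real-valued scores into a 0/1 mask."""
    return [1.0 if s > threshold else 0.0 for s in scores]

def predict(weights, mask, x):
    """Linear model whose frozen weights are gated elementwise by the mask."""
    return sum(w * m * xi for w, m, xi in zip(weights, mask, x))

def tune_mask(weights, data, steps=200, lr=0.1, drop_prob=0.3, seed=0):
    """Learn mask scores on downstream data; `weights` is never modified."""
    rng = random.Random(seed)
    scores = [1.0] * len(weights)  # start with every weight kept
    for _ in range(steps):
        mask = binarize(scores)
        grads = [0.0] * len(weights)
        for x, y in data:
            err = predict(weights, mask, x) - y
            # Gradient of 0.5 * err**2 with respect to mask_i is err * w_i * x_i.
            # Straight-through assumption: reuse it as the gradient of score_i.
            for i, (w, xi) in enumerate(zip(weights, x)):
                grads[i] += err * w * xi / len(data)
        for i, g in enumerate(grads):
            # Stand-in for gradient dropout (assumed rule): randomly skip updates.
            if rng.random() < drop_prob:
                continue
            scores[i] -= lr * g
    return binarize(scores), scores

# Toy "pre-trained" weights; the downstream task only needs the first and third.
weights = [2.0, -3.0, 0.5]
data = [([1.0, 0.0, 0.0], 2.0), ([0.0, 1.0, 0.0], 0.0), ([0.0, 0.0, 1.0], 0.5)]
mask, scores = tune_mask(weights, data)
print(mask)
```

In this toy setting the only weight that hurts the downstream targets is the second one, so its score is pushed below the threshold and the mask drops it, while the stored weights are left untouched. A realistic setup would apply the same idea to selected layers of a frozen VLM and use the paper's own regularization rule.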

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications

Methods: Dropout · Contrastive Language-Image Pre-training · Adapter