VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control
Zi-Yuan Hu, Yanyang Li, Michael R. Lyu, Liwei Wang

TL;DR
VL-PET is a parameter-efficient tuning framework for vision-and-language models that uses a granularity-controlled mechanism to regulate modular modifications (e.g., Adapter and LoRA), improving performance and transferability across multiple tasks.
Contribution
The paper proposes a granularity-controlled mechanism for parameter-efficient tuning of encoder-decoder vision-and-language models, giving a better trade-off between efficiency and effectiveness.
Findings
VL-PET outperforms VL-Adapter and LoRA on image-text tasks.
The framework improves transferability across tasks.
Lightweight PET modules enhance VL alignment and text generation.
Abstract
As the model size of pre-trained language models (PLMs) grows rapidly, full fine-tuning becomes prohibitively expensive for model training and storage. In vision-and-language (VL), parameter-efficient tuning (PET) techniques are proposed to integrate modular modifications (e.g., Adapter and LoRA) into encoder-decoder PLMs. By tuning a small set of trainable parameters, these techniques perform on par with full fine-tuning. However, excessive modular modifications and neglecting the functionality gap between the encoders and decoders can lead to performance degradation, while existing PET techniques (e.g., VL-Adapter) overlook these critical issues. In this paper, we propose a Vision-and-Language Parameter-Efficient Tuning (VL-PET) framework to impose effective control over modular modifications via a novel granularity-controlled mechanism. Considering different granularity-controlled…
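To make the idea of a controlled modular modification concrete, here is a minimal sketch of a bottleneck adapter whose update is scaled by a learned gate before being added back to the hidden states. The abstract excerpt above is truncated and does not define the mechanism, so everything here is an assumption for illustration: the function name gated_adapter, the parameter w_gate, the sigmoid gate, and the rule that the width of w_gate sets the granularity (one gate value per token versus one per hidden element) are hypothetical and should not be read as the paper's formulation.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def relu(m):
    """Element-wise ReLU on a matrix."""
    return [[max(0.0, v) for v in row] for row in m]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_adapter(h, w_down, w_up, w_gate):
    """Hypothetical gated adapter (illustrative assumption, not the paper's definition).

    h:      N x d hidden states
    w_down: d x r down-projection (r < d is the bottleneck size)
    w_up:   r x d up-projection
    w_gate: d x 1 (one gate per token) or d x d (one gate per element)

    Returns h + gate * delta, where delta = relu(h @ w_down) @ w_up.
    """
    delta = matmul(relu(matmul(h, w_down)), w_up)
    gate = [[sigmoid(v) for v in row] for row in matmul(h, w_gate)]
    out = []
    for h_row, d_row, g_row in zip(h, delta, gate):
        if len(g_row) == 1:
            # Coarse granularity: broadcast the single per-token gate over all dims.
            g_row = g_row * len(h_row)
        out.append([hv + g * dv for hv, g, dv in zip(h_row, g_row, d_row)])
    return out


# Usage: two tokens, hidden size 2, bottleneck size 1.
h = [[1.0, 2.0], [3.0, 4.0]]
w_down = [[1.0], [1.0]]
w_up = [[1.0, 0.0]]
per_token_gate = [[0.0], [0.0]]            # sigmoid(0) = 0.5 for every token
per_element_gate = [[0.0, 0.0], [0.0, 0.0]]  # sigmoid(0) = 0.5 for every element
coarse = gated_adapter(h, w_down, w_up, per_token_gate)
fine = gated_adapter(h, w_down, w_up, per_element_gate)
```

In this sketch only w_down, w_up, and w_gate would be trained while the backbone stays frozen, which is what keeps the trainable parameter count small. A coarser gate needs fewer parameters, while a finer gate can scale each hidden dimension separately.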
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning
Methods: Adapter
