Non-Intrusive Adaptation: Input-Centric Parameter-efficient Fine-Tuning for Versatile Multimodal Modeling
Yaqing Wang, Jialin Wu, Tanmaya Dabral, Jiageng Zhang, Geoff Brown,, Chun-Ta Lu, Frederick Liu, Yi Liang, Bo Pang, Michael Bendersky, Radu Soricut

TL;DR
This paper introduces AdaLink, a non-intrusive parameter-efficient fine-tuning method for large multimodal models, achieving competitive performance while maintaining the model's internal architecture unchanged, simplifying adaptation and deployment.
Contribution
The paper proposes AdaLink, a novel non-intrusive PEFT technique that matches state-of-the-art intrusive methods in performance for multimodal models.
Findings
AdaLink performs comparably to LoRA and full fine-tuning.
It is effective across text-only and multimodal tasks.
AdaLink simplifies model adaptation by avoiding internal architecture modifications.
Abstract
Large language models (LLMs) and vision language models (VLMs) demonstrate excellent performance on a wide range of tasks by scaling up parameter counts from O(10^9) to O(10^{12}) levels and further beyond. These large scales make it impossible to adapt and deploy fully specialized models given a task of interest. Parameter-efficient fine-tuning (PEFT) emerges as a promising direction to tackle the adaptation and serving challenges for such large models. We categorize PEFT techniques into two types: intrusive and non-intrusive. Intrusive PEFT techniques directly change a model's internal architecture. Though more flexible, they introduce significant complexities for training and serving. Non-intrusive PEFT techniques leave the internal architecture unchanged and only adapt model-external parameters, such as embeddings for input. In this work, we describe AdaLink as a non-intrusive PEFT…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMultimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques
