An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models

Xiongtao Zhou; Jie He; Yuhua Ke; Guangyao Zhu; Víctor Gutiérrez-Basulto; Jeff Z. Pan

arXiv:2406.05130·cs.CL·June 10, 2024

TL;DR

This study empirically evaluates parameter-efficient fine-tuning methods for multimodal large language models, demonstrating that adapters generally outperform other methods in enhancing model performance with limited parameter updates.

Contribution

The paper provides a comprehensive empirical analysis of four PEFT methods across multiple models, datasets, and scenarios, highlighting the effectiveness of adapters and of fine-tuning the connector layers.

Findings

01. Adapters are the best-performing PEFT method across experiments.

02. Fine-tuning connector layers improves performance in most MLLMs.

03. PEFT methods impact model stability, generalization, and hallucination tendencies.

Abstract

Multimodal large language models (MLLMs) fine-tuned with multimodal instruction datasets have demonstrated remarkable capabilities in multimodal tasks. However, fine-tuning all parameters of MLLMs has become challenging as they usually contain billions of parameters. To address this issue, we study parameter-efficient fine-tuning (PEFT) methods for MLLMs. We aim to identify effective methods for enhancing the performance of MLLMs in scenarios where only a limited number of parameters are trained. This paper conducts empirical studies using four popular PEFT methods to fine-tune the LLM component of open-source MLLMs. We present a comprehensive analysis that encompasses various aspects, including the impact of PEFT methods on various models, parameters and location of the PEFT module, size of fine-tuning data, model stability based on PEFT methods, MLLM's generalization, and…
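To make "only a limited number of parameters are trained" concrete, here is a minimal sketch of a bottleneck adapter, the PEFT family the findings highlight. It is an illustration under stated assumptions, not the paper's implementation: the names `make_adapter`, `adapter_forward`, and `count_params`, the layer sizes, the ReLU nonlinearity, and the zero-initialised up-projection are all hypothetical choices made for this example.

```python
import random


def make_adapter(hidden, bottleneck, seed=0):
    """Create weights for a bottleneck adapter (hypothetical layout)."""
    rng = random.Random(seed)
    return {
        # Down-projection: hidden -> bottleneck, small random init.
        "w_down": [[rng.gauss(0.0, 0.02) for _ in range(bottleneck)]
                   for _ in range(hidden)],
        "b_down": [0.0] * bottleneck,
        # Up-projection: bottleneck -> hidden. Zero init is an assumption
        # here; it makes the adapter start out as the identity function.
        "w_up": [[0.0] * hidden for _ in range(bottleneck)],
        "b_up": [0.0] * hidden,
    }


def adapter_forward(x, params):
    """Return x + up(relu(down(x))) for a single hidden-state vector x."""
    n_bottleneck = len(params["b_down"])
    n_hidden = len(params["b_up"])
    # Down-project and apply ReLU.
    h = [
        max(0.0, sum(xi * row[j] for xi, row in zip(x, params["w_down"]))
            + params["b_down"][j])
        for j in range(n_bottleneck)
    ]
    # Up-project back to the hidden size.
    up = [
        sum(hj * row[i] for hj, row in zip(h, params["w_up"]))
        + params["b_up"][i]
        for i in range(n_hidden)
    ]
    # Residual connection: the frozen layer's output passes through unchanged.
    return [xi + ui for xi, ui in zip(x, up)]


def count_params(hidden, bottleneck):
    """Trainable parameters: two weight matrices plus two bias vectors."""
    return 2 * hidden * bottleneck + hidden + bottleneck


if __name__ == "__main__":
    hidden, bottleneck = 8, 2  # toy sizes, chosen only for illustration
    params = make_adapter(hidden, bottleneck)
    x = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 3.0, 1.0]
    print(adapter_forward(x, params) == x)   # identity at initialisation
    print(count_params(hidden, bottleneck))  # adapter parameters
    print(hidden * hidden + hidden)          # one dense hidden-to-hidden layer
```

With these toy sizes the adapter adds 42 trainable parameters, against 72 for a single dense hidden-to-hidden layer. The gap widens as the hidden size grows while the bottleneck stays small, which is the setting PEFT targets: the backbone weights stay frozen and only the adapter is updated.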

Code & Models

Repositories

alenai97/peft-mllm
jax · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Adapter