Comparative Analysis of Different Efficient Fine Tuning Methods of Large Language Models (LLMs) in Low-Resource Setting

Krishna Prasad Varadarajan Srinivasan; Prasanth Gumpena; Madhusudhana Yattapu; Vishal H. Brahmbhatt

arXiv:2405.13181 · cs.CL · May 24, 2024 · 3 cites

Open Access

TL;DR

This paper compares various efficient fine-tuning methods for large language models in low-resource settings, analyzing their performance, resource requirements, and generalization capabilities across different datasets.

Contribution

It provides an extensive comparison of traditional and alternative fine-tuning strategies, including LoRA and context distillation, on diverse datasets, highlighting their relative strengths and limitations.

Findings

01. Context distillation outperforms standard fine-tuning methods.

02. PBFT under-performs Vanilla FT on out-of-domain data.

03. Adaptive fine-tuning and LoRA perform comparably or slightly worse than full fine-tuning.
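
The trade-off behind the LoRA finding can be illustrated with a small sketch. LoRA freezes a weight matrix W and trains only a low-rank update, so the effective weight is W + (alpha / r) · B · A, with A of shape r × d_in and B of shape d_out × r. The code below is a minimal pure-Python illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, and the layer size (768) and rank (8) are illustrative values that do not come from the paper.

```python
# Minimal LoRA sketch in pure Python. All names and sizes here are
# illustrative assumptions, not taken from the paper.

def full_ft_params(d_out, d_in):
    """Trainable parameters when the whole weight matrix is fine-tuned."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable parameters when only the low-rank factors A and B are trained."""
    return rank * (d_in + d_out)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, rank):
    """Compute (W + (alpha / rank) * B @ A) @ x without forming B @ A."""
    base = matvec(W, x)               # frozen pre-trained path
    update = matvec(B, matvec(A, x))  # trainable low-rank path
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


if __name__ == "__main__":
    d, r = 768, 8  # assumed layer width and rank
    print("full fine-tuning params:", full_ft_params(d, d))
    print("LoRA params:", lora_params(d, d, r))

    W = [[1.0, 2.0], [3.0, 4.0]]
    A = [[1.0, 0.0]]       # rank 1, d_in = 2
    B = [[0.0], [0.0]]     # zero-initialised, so the update starts at zero
    print(lora_forward(W, A, B, [1.0, 1.0], alpha=1.0, rank=1))
```

With B initialised to zero, the adapted layer initially reproduces the frozen layer, and training touches only the entries of A and B. That is why the memory cost of the trainable parameters drops sharply, while the finding above suggests the accuracy stays close to, or slightly below, full fine-tuning.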

Abstract

In the domain of large language models (LLMs), arXiv:2305.16938 showed that few-shot full-model fine-tuning (Vanilla Fine-Tuning, FT, and Pattern-Based Fine-Tuning, PBFT) and In-Context Learning (ICL) generalize similarly on Out-Of-Domain (OOD) datasets but differ in task adaptation. However, both approaches pose challenges, especially in terms of memory requirements. In this paper, we extend the understanding of fine-tuning strategies for LLMs by placing a range of them on a common footing for a detailed comparison with full-model fine-tuning on two diverse datasets. To that end, we conducted a series of experiments, beginning with state-of-the-art methods like vanilla fine-tuning and PBFT on pre-trained models across two datasets, COLA and MNLI. We then investigate adaptive fine-tuning and the efficiency of…

Taxonomy

Topics: Topic Modeling

Methods: COLA