Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data

Parth Patwa; Simone Filice; Zhiyu Chen; Giuseppe Castellucci; Oleg Rokhlenko; Shervin Malmasi

arXiv:2404.02422·cs.CL·April 4, 2024·1 cites

Open Access

TL;DR

This paper combines synthetic data generation, filtering, and parameter-efficient fine-tuning (PEFT) to improve LLM text classification in low-resource settings. The resulting classifier is as efficient at inference as a zero-shot classifier while reaching accuracy comparable to or better than in-context learning.

Contribution

The paper presents an approach that uses synthetic data and PEFT to improve low-resource LLM text classification without the inference cost that in-context learning (ICL) incurs from its longer input prompts.

Findings

01

Achieves competitive accuracy with only 4 real examples per class.

02

Reduces inference cost compared to ICL by avoiding the longer few-shot prompt.

03

Delivers competitive results across multiple text classification datasets.
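
The cost difference between ICL and a 0-shot prompt comes from prompt length, which a small sketch can make concrete. The prompt template, the example texts, and the whitespace-based token count below are all illustrative assumptions, not the paper's setup; a real measurement would use the model's own tokenizer.

```python
def build_prompt(text, labels, demos=()):
    """Build a classification prompt; demos is a sequence of (text, label) pairs."""
    lines = [f"Classify the text into one of: {', '.join(labels)}."]
    for demo_text, demo_label in demos:
        # Each in-context demonstration adds to the prompt on every query.
        lines.append(f"Text: {demo_text}\nLabel: {demo_label}")
    lines.append(f"Text: {text}\nLabel:")
    return "\n".join(lines)


def rough_token_count(prompt):
    # Whitespace splitting is a crude stand-in for a real tokenizer (assumption).
    return len(prompt.split())


labels = ["positive", "negative"]
# 4 demonstrations per class, mirroring the low-resource setting (toy texts).
demos = [("great movie, loved it", "positive")] * 4 + [
    ("boring and too long", "negative")
] * 4
query = "the plot was surprisingly good"

zero_shot_len = rough_token_count(build_prompt(query, labels))
icl_len = rough_token_count(build_prompt(query, labels, demos))
print(zero_shot_len, icl_len)
```

Because the demonstrations are resent with every query, the extra length is paid at each inference call, whereas a fine-tuned classifier can be queried with the short prompt.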

Abstract

Large Language Models (LLMs) operating in 0-shot or few-shot settings achieve competitive results in Text Classification tasks. In-Context Learning (ICL) typically achieves better accuracy than the 0-shot setting, but it pays in terms of efficiency, due to the longer input prompt. In this paper, we propose a strategy to make LLMs as efficient as 0-shot text classifiers, while getting comparable or better accuracy than ICL. Our solution targets the low resource setting, i.e., when only 4 examples per class are available. Using a single LLM and few-shot real data we perform a sequence of generation, filtering and Parameter-Efficient Fine-Tuning steps to create a robust and efficient classifier. Experimental results show that our approach leads to competitive results on multiple text classification datasets.
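
The generation and filtering steps named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the filtering criterion, so the sketch assumes a label-consistency filter (keep a synthetic text only if a classifier assigns it the label it was generated for). The names `build_training_set`, `toy_generate`, and `toy_predict` are hypothetical, and the toy functions stand in for LLM calls.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (text, label)


def build_training_set(
    seed: List[Example],
    generate: Callable[[str, List[str]], List[str]],
    predict: Callable[[str], str],
) -> List[Example]:
    """Generate synthetic texts per class, then keep only those whose
    predicted label matches the class they were generated for.
    The label-consistency filter is an assumption of this sketch."""
    by_label: Dict[str, List[str]] = {}
    for text, label in seed:
        by_label.setdefault(label, []).append(text)

    kept: List[Example] = list(seed)  # the few real examples are always kept
    for label, texts in by_label.items():
        for candidate in generate(label, texts):
            if predict(candidate) == label:
                kept.append((candidate, label))
    return kept


def toy_generate(label: str, texts: List[str]) -> List[str]:
    # Stand-in for LLM generation: one variation per seed text plus one noisy output.
    return [t + " overall" for t in texts] + ["unclear remark"]


def toy_predict(text: str) -> str:
    # Stand-in for the classifier used during filtering.
    if "good" in text:
        return "positive"
    if "bad" in text:
        return "negative"
    return "unknown"


seed = [
    ("good food", "positive"),
    ("good service", "positive"),
    ("bad food", "negative"),
    ("bad service", "negative"),
]
train_set = build_training_set(seed, toy_generate, toy_predict)
# train_set would then be passed to a PEFT fine-tuning step (not shown here).
```

The fine-tuning step is left out on purpose, since the abstract does not state which PEFT method or hyperparameters are used.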

Taxonomy

Topics: Advanced Data Storage Technologies · Advanced Materials Characterization Techniques · Reservoir Engineering and Simulation Methods