Tailored-LLaMA: Optimizing Few-Shot Learning in Pruned LLaMA Models with Task-Specific Prompts

Danyal Aftab; Steven Davy

arXiv:2410.19185·cs.AI·January 10, 2025

TL;DR

This paper introduces Tailored-LLaMA, a method for efficient few-shot learning on pruned LLaMA models that combines task-specific prompts with LoRA fine-tuning, achieving high accuracy at significantly reduced model sizes.

Contribution

The paper presents a novel approach combining structural pruning, task-specific prompts, and LoRA for effective fine-tuning of pruned LLaMA models in few-shot learning scenarios.

Findings

01

Fine-tuning pruned models restores high accuracy in classification tasks.

02

Models retain over 65% of baseline accuracy after 50% pruning.

03

Fine-tuning for less than one hour achieves near-baseline performance.

Abstract

Large language models demonstrate impressive proficiency in language understanding and generation. Nonetheless, training these models from scratch, even the least complex billion-parameter variant, demands significant computational resources, rendering it economically impractical for many organizations. With large language models functioning as general-purpose task solvers, this paper investigates their task-specific fine-tuning. We employ task-specific datasets and prompts to fine-tune two pruned LLaMA models with 5 billion and 4 billion parameters. This process reuses the pre-trained weights and updates only a subset of weights using the LoRA method. One challenge in fine-tuning the LLaMA model is crafting a precise prompt tailored to the specific task. To address this, we propose a novel approach to fine-tune the LLaMA model under two primary constraints: task specificity and prompt…
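
The abstract refers to LoRA, which keeps the pre-trained weights frozen and trains only a small low-rank update. The sketch below illustrates the standard LoRA formulation, h = W x + (alpha / r) · B (A x), in plain Python. It is not the authors' code: the dimensions, the rank `r`, the scaling factor `alpha`, and the helper names `matvec` and `lora_forward` are assumptions chosen for illustration.

```python
import random


def matvec(M, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(x, W, A, B, alpha):
    """Return W x + (alpha / r) * B (A x).

    W is the frozen pre-trained weight (d_out x d_in).
    A (r x d_in) and B (d_out x r) are the only trainable matrices.
    """
    r = len(A)  # rank of the low-rank update
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + (alpha / r) * d for b, d in zip(base, delta)]


# Hypothetical toy dimensions, far smaller than any real LLaMA layer.
random.seed(0)
d_in, d_out, r = 16, 16, 2

W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]  # zero init: the update starts at zero
x = [random.gauss(0, 1) for _ in range(d_in)]

y = lora_forward(x, W, A, B, alpha=4.0)

trainable = r * (d_in + d_out)  # parameters in A and B
frozen = d_in * d_out           # parameters in W
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and training only has to learn the `r * (d_in + d_out)` adapter parameters instead of all `d_in * d_out` weights.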

Taxonomy

Methods: Pruning · LLaMA