Slimming Down LLMs Without Losing Their Minds

Qingda (Michael) Mai

arXiv:2506.10885 · cs.CL · June 13, 2025

Open Access

TL;DR

This paper explores how fine-tuning large language models with parameter-efficient methods such as LoRA and QLoRA can improve task performance at low computational cost, and stresses that the gains depend on how well the fine-tuning dataset aligns with the target benchmark.

Contribution

It evaluates LoRA and QLoRA across three domains (commonsense reasoning, mathematical reasoning, and multi-domain knowledge), offering theoretical insights and practical guidance for efficient LLM adaptation.

Findings

01. LoRA improves task-specific performance efficiently

02. Performance depends on dataset and benchmark alignment

03. Parameter-efficient methods are viable for resource-limited settings

Abstract

This paper investigates and validates the impact of fine-tuning on large language model performance, focusing on parameter-efficient methods (LoRA and QLoRA). We evaluate model capabilities across three key domains: (1) commonsense reasoning (HellaSwag), (2) mathematical reasoning (GSM8K), and (3) multi-domain knowledge (MMLU-CS). Our findings demonstrate that: (1) LoRA-based methods effectively improve task-specific performance while maintaining computational efficiency, and (2) performance strongly depends on the alignment between the fine-tuning dataset and the benchmark tasks. The study provides both theoretical insights into parameter-efficient mechanisms and practical guidance for developers implementing efficient LLM adaptation with limited resources.
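To make the "parameter-efficient" claim concrete, here is a minimal sketch of the standard LoRA formulation, in which a frozen weight W is augmented by a scaled low-rank update (alpha / r) * B @ A and only A and B are trained. It is not taken from the paper: the function names, the 4096 x 4096 layer size, and the rank of 8 are illustrative assumptions, and the code uses plain Python lists rather than any particular training library.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable parameter counts for one weight matrix."""
    full = d_out * d_in              # full fine-tuning updates every entry of W
    lora = rank * (d_out + d_in)     # LoRA trains B (d_out x r) and A (r x d_in)
    return full, lora


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, b, a, alpha, rank):
    """Frozen weight W plus the scaled low-rank update (alpha / rank) * B @ A."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Hypothetical layer size and rank, chosen only for illustration.
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(full, lora, lora / full)
```

Under these assumed dimensions the adapter trains 65,536 parameters against 16,777,216 for the full matrix, roughly 0.4% of the original. Because B is conventionally initialized to zero, the effective weight equals W at the start of training, so fine-tuning begins from the pretrained model's behavior.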

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification