Beware of the Batch Size: Hyperparameter Bias in Evaluating LoRA
Sangyoon Lee, Jaeho Lee

TL;DR
This paper highlights the critical influence of batch size on evaluating LoRA, revealing that proper tuning of batch size can reconcile conflicting results and improve the reliability of model fine-tuning assessments.
Contribution
It demonstrates that batch size is a key factor in LoRA performance evaluation and introduces a proxy-based method for efficient batch size tuning.
Findings
With a properly tuned batch size, vanilla LoRA often matches the performance of more complex LoRA variants.
Batch size significantly affects model fine-tuning outcomes.
A cost-efficient strategy for batch size optimization is proposed.
Abstract
Low-rank adaptation (LoRA) is a standard approach for fine-tuning large language models, yet its many variants report conflicting empirical gains, often on the same benchmarks. We show that these contradictions arise from a single overlooked factor: the batch size. When properly tuned, vanilla LoRA often matches the performance of more complex variants. We further propose a proxy-based, cost-efficient strategy for batch size tuning, revealing the impact of rank, dataset size, and model capacity on the optimal batch size. Our findings elevate batch size from a minor implementation detail to a first-order design parameter, reconciling prior inconsistencies and enabling more reliable evaluations of LoRA variants.
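To make the abstract's central claim concrete, the following is a minimal sketch of a batch size sweep for vanilla LoRA on a toy linear task. It is not the authors' proxy-based method, which this page does not describe. The task, the helper names (`make_task`, `train_lora`, `mse`), the candidate batch sizes, and the fixed learning rate are all assumptions made for illustration. Every candidate trains for the same number of epochs, so larger batches take fewer gradient steps.

```python
import math
import random

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def predict(W0, A, B, x):
    # Frozen base output plus the low-rank update: (W0 + B A) x
    base = matvec(W0, x)
    delta = matvec(B, matvec(A, x))
    return [b + d for b, d in zip(base, delta)]

def mse(data, W0, A, B):
    # Mean over examples of the summed squared error.
    total = 0.0
    for x, y in data:
        total += sum((p - t) ** 2 for p, t in zip(predict(W0, A, B, x), y))
    return total / len(data)

def make_task(n, d=4, rank=2, seed=1):
    # Hypothetical toy task: targets come from W0 plus a true low-rank shift.
    rng = random.Random(seed)
    W0 = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(d)]
    U = [[rng.gauss(0.0, 0.5) for _ in range(rank)] for _ in range(d)]
    V = [[rng.gauss(0.0, 0.5) for _ in range(d)] for _ in range(rank)]
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        data.append((x, predict(W0, V, U, x)))
    return W0, data

def train_lora(train, W0, rank, batch_size, lr=0.02, epochs=5, seed=0):
    rng = random.Random(seed)
    d_out, d_in = len(W0), len(W0[0])
    # Standard LoRA init: A small random, B zero, so the adapter starts as a no-op.
    A = [[rng.gauss(0.0, 0.1) for _ in range(d_in)] for _ in range(rank)]
    B = [[0.0] * rank for _ in range(d_out)]
    order = list(range(len(train)))
    for _ in range(epochs):
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            batch = [train[i] for i in order[start:start + batch_size]]
            gA = [[0.0] * d_in for _ in range(rank)]
            gB = [[0.0] * rank for _ in range(d_out)]
            for x, y in batch:
                h = matvec(A, x)
                err = [p - t for p, t in zip(predict(W0, A, B, x), y)]
                # Gradients of 0.5 * ||err||^2; W0 stays frozen.
                back = [sum(B[i][k] * err[i] for i in range(d_out))
                        for k in range(rank)]
                for i in range(d_out):
                    for k in range(rank):
                        gB[i][k] += err[i] * h[k]
                for k in range(rank):
                    for j in range(d_in):
                        gA[k][j] += back[k] * x[j]
            step = lr / len(batch)  # average the gradient over the minibatch
            for i in range(d_out):
                for k in range(rank):
                    B[i][k] -= step * gB[i][k]
            for k in range(rank):
                for j in range(d_in):
                    A[k][j] -= step * gA[k][j]
    return A, B

W0, data = make_task(n=96)
train, val = data[:64], data[64:]
candidates = [1, 4, 16, 64]  # assumed grid, not taken from the paper
results = {}
for bs in candidates:
    A, B = train_lora(train, W0, rank=2, batch_size=bs)
    results[bs] = mse(val, W0, A, B)

best = min(results, key=results.get)
for bs in candidates:
    print(f"batch_size={bs:>2}  val_mse={results[bs]:.4f}")
print("best batch size:", best)
```

Holding the learning rate fixed across batch sizes is a simplification. A fair comparison in practice would likely tune the learning rate jointly with the batch size, and the ranking on a toy task like this one says nothing about the ranking on real benchmarks.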
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Machine Learning and Data Classification
