Hyperparameter Optimization for Large Language Model Instruction-Tuning

Christophe Tribes; Sacha Benarroch-Lelong; Peng Lu; Ivan Kobyzev

arXiv:2312.00949 · cs.CL · February 1, 2024 · 2 cites

Open Access

TL;DR

This paper treats LoRA-based fine-tuning of a large language model as a blackbox and tunes its hyperparameters with blackbox optimization techniques, in particular the NOMAD algorithm, to improve downstream performance and human alignment.

Contribution

It applies blackbox optimization, particularly the NOMAD algorithm, to tune the hyperparameters of LoRA-based fine-tuning of LLMs, enhancing downstream task performance.

Findings

01. Hyperparameter tuning with NOMAD improves model performance.

02. Optimized hyperparameters lead to better human alignment.

03. Efficient exploration of the hyperparameter space enhances fine-tuning results.

Abstract

The fine-tuning of Large Language Models (LLMs) has enabled them to recently achieve milestones in natural language processing applications. The emergence of ever larger LLMs has paved the way for more efficient fine-tuning methods. Among these, the Low-Rank Adaptation (LoRA) method keeps most of the weights of the pre-trained LLM frozen while introducing a low-rank decomposition of the weight matrix, enabling the tuning of only a very small proportion of the network. The performance on downstream tasks of models fine-tuned with LoRA heavily relies on a set of hyperparameters including the rank of the decomposition. In this work, we investigate the choice of these hyperparameters through two main blackbox optimization (BBO) techniques. We examine the whole pipeline of performing fine-tuning and validation on a pre-trained LLM as a blackbox and efficiently explore the space of…
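
The two ideas in the abstract can be illustrated with a minimal sketch, which is not the authors' code. It first counts how few parameters a low-rank decomposition trains compared with a full weight matrix, then treats "fine-tune, then validate" as a blackbox that maps hyperparameters to a loss. Everything here is an assumption made for illustration: the function names, the synthetic loss surface and its made-up optimum, the candidate ranks, and the learning-rate range. Plain random search is used as a simple stand-in; it is not NOMAD.

```python
import math
import random


def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the two low-rank factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_in + d_out)


def toy_blackbox(hparams):
    """Hypothetical stand-in for 'fine-tune with LoRA, then validate'.

    A real blackbox would launch a fine-tuning run and return a validation
    loss. This synthetic bowl has its minimum at rank=16, lr=1e-4; those
    numbers are invented for the example, not taken from the paper.
    """
    rank, lr = hparams["rank"], hparams["lr"]
    return (math.log2(rank) - 4.0) ** 2 + (math.log10(lr) + 4.0) ** 2


def random_search(blackbox, budget, seed=0):
    """Evaluate `budget` random hyperparameter points and keep the best one."""
    rng = random.Random(seed)
    best_h, best_loss = None, float("inf")
    for _ in range(budget):
        h = {
            "rank": rng.choice([4, 8, 16, 32, 64]),  # assumed candidate ranks
            "lr": 10 ** rng.uniform(-6, -3),         # assumed log-uniform range
        }
        loss = blackbox(h)  # the optimizer sees only inputs and outputs
        if loss < best_loss:
            best_h, best_loss = h, loss
    return best_h, best_loss


if __name__ == "__main__":
    full = 4096 * 4096
    low_rank = lora_trainable_params(4096, 4096, rank=8)
    print(f"trainable fraction at rank 8: {low_rank / full:.4%}")
    print(random_search(toy_blackbox, budget=20))
```

The search loop only ever calls `blackbox(h)` and reads back a number, which is what makes the pipeline a blackbox: a different optimizer could replace `random_search` without touching the fine-tuning code.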


Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification

Methods: Sparse Evolutionary Training