BlockLLM: Memory-Efficient Adaptation of LLMs by Selecting and Optimizing the Right Coordinate Blocks

Amrutha Varshini Ramesh; Vignesh Ganapathiraman; Issam H. Laradji; Mark Schmidt

arXiv:2406.17296 · cs.LG · December 17, 2024

TL;DR

BlockLLM introduces a memory-efficient method for large language model adaptation by selecting and optimizing a small subset of parameters, achieving state-of-the-art results with significantly reduced memory usage.

Contribution

It proposes a novel block coordinate descent-based approach that selectively updates a small subset of parameters without altering model architecture or training procedures.
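As a rough illustration of block coordinate descent with block selection, the sketch below treats each named parameter group as one block, picks the `k` blocks with the largest gradient norm, and updates only those. The largest-gradient-norm rule, the toy quadratic loss, and all names (`select_blocks`, `bcd_step`, `layer0`, and so on) are assumptions made for this example; they are not taken from the paper or its repository.

```python
import math

def loss(params, targets):
    """Toy objective: 0.5 * squared distance between params and targets."""
    return 0.5 * sum(
        (p - t) ** 2
        for name in params
        for p, t in zip(params[name], targets[name])
    )

def gradients(params, targets):
    """Per-block gradient of the toy objective (simply params - targets)."""
    return {
        name: [p - t for p, t in zip(params[name], targets[name])]
        for name in params
    }

def select_blocks(grads, k):
    """Assumed selection rule: keep the k blocks with the largest gradient norm."""
    norms = {name: math.sqrt(sum(g * g for g in gs)) for name, gs in grads.items()}
    return sorted(norms, key=norms.get, reverse=True)[:k]

def bcd_step(params, targets, k, lr):
    """One block coordinate descent step: update only the selected blocks."""
    grads = gradients(params, targets)
    chosen = select_blocks(grads, k)
    for name in chosen:
        params[name] = [p - lr * g for p, g in zip(params[name], grads[name])]
    return chosen

# Hypothetical three-block "model"; every block starts at zero.
params = {"layer0": [0.0, 0.0], "layer1": [0.0, 0.0], "layer2": [0.0, 0.0]}
targets = {"layer0": [1.0, 1.0], "layer1": [5.0, 5.0], "layer2": [0.1, 0.1]}

initial_loss = loss(params, targets)
first_chosen = bcd_step(params, targets, k=1, lr=0.5)
layer0_after_first_step = list(params["layer0"])  # should be untouched

for _ in range(20):
    bcd_step(params, targets, k=1, lr=0.5)
final_loss = loss(params, targets)
```

Only the selected block changes in a given step, so a stateful optimizer would need to keep state only for the blocks currently being trained; the architecture and the loss are left as they are.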

Findings

01

Achieves state-of-the-art performance on GLUE fine-tuning benchmarks while updating less than 5% of the parameters.

02

Reduces memory footprint significantly during training of large models.

03

Maintains competitive performance when pretraining Llama models, with reduced memory requirements.
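To see why updating a small fraction of parameters can shrink the training footprint, a back-of-the-envelope estimate of optimizer state is sketched below. It assumes an Adam-style optimizer that stores two values (first and second moment) per trained parameter in 32-bit floats, and a hypothetical 7-billion-parameter model; neither figure comes from the paper, and weights, gradients, and activations are deliberately left out.

```python
def adam_state_bytes(n_params, trained_percent, bytes_per_value=4):
    """Optimizer-state size for an Adam-style optimizer.

    Assumes two stored values (first and second moment) for every
    trained parameter; integer arithmetic avoids rounding surprises.
    """
    trained = n_params * trained_percent // 100
    return trained * 2 * bytes_per_value

N_PARAMS = 7_000_000_000  # hypothetical model size, for illustration only

full_bytes = adam_state_bytes(N_PARAMS, trained_percent=100)
subset_bytes = adam_state_bytes(N_PARAMS, trained_percent=5)

print(f"all parameters trained: {full_bytes / 1e9:.1f} GB of optimizer state")
print(f"5% of parameters trained: {subset_bytes / 1e9:.1f} GB of optimizer state")
```

Under these assumptions the optimizer state scales linearly with the trained fraction, so training 5% of the parameters needs one twentieth of the state that full training does.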

Abstract

Training large language models (LLMs), whether pretraining them or adapting them to new tasks and domains, has become increasingly critical as their applications expand. However, as model and data sizes grow, the training process presents significant memory challenges, often requiring a prohibitive amount of GPU memory that may not be readily available. Existing methods such as low-rank adaptation (LoRA) add trainable low-rank matrix factorizations, altering the training dynamics and limiting the model's parameter search to a low-rank subspace. GaLore, a more recent method, employs Gradient Low-Rank Projection to reduce the memory footprint in the full-parameter training setting. However, GaLore can only be applied to a subset of the LLM layers that satisfy the "reversibility" property, thus limiting its applicability. In response to these challenges, we introduce BlockLLM, an approach…

Code & Models

Repositories

RAmruthaVignesh/blockllm
PyTorch · Official

Taxonomy

Topics: Neural Networks and Applications · Advanced Data Storage Technologies

Methods: LLaMA