LaMDA: Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation

Seyedarmin Azizi; Souvik Kundu; Massoud Pedram

arXiv:2406.12832·cs.CL·June 19, 2024

TL;DR

LaMDA introduces a spectrally decomposed low-dimensional adaptation method for fine-tuning large language models, cutting parameter updates by up to 17.7x and peak GPU memory usage by up to 1.32x while maintaining or improving performance across various NLP tasks.

Contribution

The paper presents LaMDA, a novel fine-tuning approach that leverages spectral decomposition to reduce parameter updates and memory usage, with an enhanced version LaMDA++ incorporating adaptive rank allocation.

Findings

01. Achieves up to 17.7x fewer parameter updates.
02. Reduces peak GPU memory usage by up to 1.32x.
03. Matches or surpasses existing fine-tuning methods in performance.

Abstract

Low-rank adaptation (LoRA) has become the default approach to fine-tune large language models (LLMs) due to its significant reduction in trainable parameters. However, trainable parameter demand for LoRA increases with increasing model embedding dimensions, leading to high compute costs. Additionally, its backward updates require storing high-dimensional intermediate activations and optimizer states, demanding high peak GPU memory. In this paper, we introduce large model fine-tuning via spectrally decomposed low-dimensional adaptation (LaMDA), a novel approach to fine-tuning large language models, which leverages low-dimensional adaptation to achieve significant reductions in trainable parameters and peak GPU memory footprint. LaMDA freezes a first projection matrix (PMA) in the adaptation path while introducing a low-dimensional trainable square matrix, resulting in substantial…
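
To make the parameter-count argument in the abstract concrete, here is a minimal Python sketch. It assumes an adapter path made of a frozen first projection (d_in x r), a trainable square matrix (r x r), and a second projection (r x d_out). The abstract as shown here is truncated before it says how the second projection is treated, so the sketch counts both cases. The function names, the dimension 4096, and the rank 32 are illustrative assumptions, not values taken from the paper.

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters of a standard LoRA adapter.

    LoRA trains both low-rank factors: one of shape (d_in, r) and one of
    shape (r, d_out), so the count grows linearly with the embedding size.
    """
    return r * (d_in + d_out)


def square_adapter_trainable_params(
    d_out: int, r: int, second_projection_trainable: bool
) -> int:
    """Trainable parameters when the first projection is frozen.

    Assumption (not confirmed by the truncated abstract): the adapter path is
    frozen (d_in, r) -> trainable (r, r) -> second projection (r, d_out).
    The r x r square matrix is independent of the embedding dimension.
    """
    count = r * r  # the low-dimensional trainable square matrix
    if second_projection_trainable:
        count += r * d_out  # second projection, if it is still being trained
    return count


if __name__ == "__main__":
    d, r = 4096, 32  # hypothetical embedding dimension and rank
    lora = lora_trainable_params(d, d, r)
    square_only = square_adapter_trainable_params(d, r, False)
    square_plus_second = square_adapter_trainable_params(d, r, True)
    print("LoRA:", lora)
    print("square matrix only:", square_only)
    print("square matrix + second projection:", square_plus_second)
```

The point of the sketch is only that the r x r term does not depend on the model's embedding dimension, whereas both LoRA terms do. The ratios it produces for these made-up dimensions should not be read as the paper's reported 17.7x figure.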

Code & Models

Repositories

arminazizi98/lamda (PyTorch, official)

Videos

LaMDA: Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation · Underline

Taxonomy

Topics: Neural Networks and Applications · Medical Image Segmentation Techniques · Machine Learning and ELM