MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter

Jitai Hao; WeiWei Sun; Xin Xin; Qi Meng; Zhumin Chen; Pengjie Ren; Zhaochun Ren

arXiv:2406.04984·cs.CL·June 10, 2024·1 cites

TL;DR

MEFT introduces a memory-efficient fine-tuning method for large language models that leverages activation sparsity and CPU memory to enable larger adapters without requiring extensive GPU resources.

Contribution

The paper proposes a novel approach to fine-tune LLMs with larger adapters by exploiting activation sparsity and CPU memory, improving performance under limited GPU resources.

Findings

01. Achieves comparable fine-tuning results with limited GPU memory

02. Uses CPU memory and activation sparsity for efficient adapter training

03. Cuts unnecessary CPU computation and reduces communication overhead with a Mixture of Experts (MoE)-like architecture

Abstract

Parameter-Efficient Fine-Tuning (PEFT) facilitates the fine-tuning of Large Language Models (LLMs) under limited resources. However, the fine-tuning performance of PEFT on complex, knowledge-intensive tasks is limited by constrained model capacity, which stems from the small number of additional trainable parameters. To overcome this limitation, we introduce a novel mechanism that fine-tunes LLMs with adapters that are larger yet memory-efficient. This is achieved by leveraging the inherent activation sparsity in the Feed-Forward Networks (FFNs) of LLMs and by utilizing the larger capacity of Central Processing Unit (CPU) memory compared to that of Graphics Processing Unit (GPU) memory. We store and update the parameters of larger adapters on the CPU. Moreover, we employ a Mixture of Experts (MoE)-like architecture to mitigate unnecessary CPU computations and reduce the communication…
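The core idea in the abstract, that only a few adapter neurons activate for a given input and so only those rows need to be touched, can be illustrated with a toy example. The sketch below is not the authors' implementation: the class `SparseAdapter`, its methods, and the top-k selection rule are hypothetical names and choices made for illustration, and plain Python lists stand in for CPU-resident tensors. Note that this toy version still scores every neuron; the MoE-like routing mentioned in the abstract is what would avoid that cost, and it is not modeled here.

```python
import random


def relu(x):
    return x if x > 0.0 else 0.0


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class SparseAdapter:
    """Toy FFN-style adapter (hypothetical; weights imagined to live in CPU memory)."""

    def __init__(self, d_model, n_neurons, seed=0):
        rng = random.Random(seed)
        # One input row and one output row per adapter neuron.
        self.w_in = [[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(n_neurons)]
        self.w_out = [[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(n_neurons)]

    def activations(self, h):
        # ReLU zeroes out many neurons, which is the source of activation sparsity.
        return [relu(dot(row, h)) for row in self.w_in]

    def dense_forward(self, h):
        # Reference path: every neuron contributes, so every row is read.
        acts = self.activations(h)
        return [sum(a * row[j] for a, row in zip(acts, self.w_out)) for j in range(len(h))]

    def sparse_forward(self, h, k):
        # Assumed selection rule: keep at most k neurons with the largest
        # positive activation; only their rows would need to be transferred.
        acts = self.activations(h)
        top = sorted(range(len(acts)), key=lambda i: acts[i], reverse=True)[:k]
        active = [i for i in top if acts[i] > 0.0]
        out = [sum(acts[i] * self.w_out[i][j] for i in active) for j in range(len(h))]
        return out, active


adapter = SparseAdapter(d_model=4, n_neurons=16, seed=0)
h = [0.5, -1.0, 0.25, 2.0]
dense_out = adapter.dense_forward(h)
full_out, full_active = adapter.sparse_forward(h, k=16)
small_out, small_active = adapter.sparse_forward(h, k=2)
```

With `k` equal to the number of neurons, the sparse path reproduces the dense output up to floating-point rounding, because neurons with zero activation contribute nothing. With a smaller `k`, the output is an approximation that reads at most `k` rows of each weight matrix.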

Code & Models

Repositories

currentf/meft
PyTorch · Official

Videos

MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter · Underline

Taxonomy

Topics: Neural Networks and Reservoir Computing · Photonic and Optical Devices · Advanced Memory and Neural Computing