Block Circulant Adapter for Large Language Models
Xinyu Ding, Meiqi Wang, Siyu Liao, Zhongfeng Wang

TL;DR
This paper introduces a block circulant adapter for fine-tuning large language models. By exploiting circulant matrices and one-dimensional Fourier transforms, it reduces parameter count and computational cost while keeping task performance close to or better than that of existing methods.
Contribution
The paper proposes a block circulant matrix-based fine-tuning method with a stable training heuristic, enabling efficient adaptation of large models in the frequency domain.
Findings
14x fewer parameters than VeRA
16x smaller than LoRA
32x fewer FLOPs than FourierFT
Abstract
Fine-tuning large language models (LLMs) is difficult due to their huge model size. Recent Fourier domain-based methods show potential for reducing fine-tuning costs. We propose a block circulant matrix-based fine-tuning method with a stable training heuristic that leverages the properties of circulant matrices and one-dimensional Fourier transforms to reduce storage and computation costs. Experiments show that our method uses 14x fewer parameters than VeRA, is 16x smaller than LoRA, and requires 32x fewer FLOPs than FourierFT, while maintaining close or better task performance. Our approach presents a promising way to fine-tune large models on downstream tasks in the frequency domain.
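The abstract relies on a standard property of circulant matrices: multiplying by one is a circular convolution, so it can be computed with one-dimensional FFTs instead of a dense product. The following is a minimal NumPy sketch of that property for a block circulant matrix, not the authors' implementation. The function names, the `(rows, cols, p)` layout of the defining vectors, and the block size `p` are assumptions chosen for illustration; the paper's training heuristic and the way the adapter is attached to a model are not shown.

```python
import numpy as np

def circulant(c):
    """Dense circulant matrix whose first column is c (reference only)."""
    p = len(c)
    idx = (np.arange(p)[:, None] - np.arange(p)[None, :]) % p
    return c[idx]

def block_circulant_dense(w):
    """Assemble the dense matrix from defining vectors w of shape (rows, cols, p)."""
    rows, cols, _ = w.shape
    return np.block([[circulant(w[i, j]) for j in range(cols)]
                     for i in range(rows)])

def block_circulant_matvec(w, x):
    """Multiply the block circulant matrix defined by w with x using 1D FFTs.

    w: (rows, cols, p) array, one length-p defining vector per block (assumed layout).
    x: input vector of length cols * p.
    """
    rows, cols, p = w.shape
    xf = np.fft.fft(x.reshape(cols, p), axis=-1)   # FFT of each input block
    wf = np.fft.fft(w, axis=-1)                    # FFT of each defining vector
    yf = np.einsum("ijk,jk->ik", wf, xf)           # elementwise product, summed over column blocks
    return np.fft.ifft(yf, axis=-1).real.reshape(rows * p)

rng = np.random.default_rng(0)
w = rng.standard_normal((2, 3, 4))   # 2 x 3 blocks, block size p = 4
x = rng.standard_normal(12)

y_fast = block_circulant_matvec(w, x)
y_dense = block_circulant_dense(w) @ x
print(np.allclose(y_fast, y_dense))
```

In this sketch the 8 x 12 dense matrix has 96 entries but is described by only 24 stored values, a factor of `p` fewer, which is the kind of storage saving the abstract attributes to the circulant structure.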
