Compacter: Efficient Low-Rank Hypercomplex Adapter Layers
Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder

TL;DR
Compacter is a parameter-efficient fine-tuning method for large language models that uses hypercomplex layers and Kronecker products to achieve performance comparable to full fine-tuning while training only a tiny fraction of parameters.
Contribution
It introduces a novel hypercomplex adapter layer that significantly reduces trainable parameters while maintaining or improving task performance.
Findings
Compacter trains only 0.047% of a pretrained model's parameters and matches or exceeds standard fine-tuning performance.
It outperforms existing parameter-efficient methods on GLUE and SuperGLUE benchmarks.
In low-resource settings, Compacter outperforms standard fine-tuning.
Abstract
Adapting large-scale pretrained language models to downstream tasks via fine-tuning is the standard method for achieving state-of-the-art performance on NLP benchmarks. However, fine-tuning all weights of models with millions or billions of parameters is sample-inefficient, unstable in low-resource settings, and wasteful as it requires storing a separate copy of the model for each task. Recent work has developed parameter-efficient fine-tuning methods, but these approaches either still require a relatively large number of parameters or underperform standard fine-tuning. In this work, we propose Compacter, a method for fine-tuning large-scale language models with a better trade-off between task performance and the number of trainable parameters than prior work. Compacter accomplishes this by building on top of ideas from adapters, low-rank optimization, and parameterized hypercomplex…
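To make the Kronecker-product idea concrete, here is a minimal NumPy sketch of how a low-rank hypercomplex adapter weight might be assembled. It assumes the weight is a sum of n Kronecker products between small n x n matrices and low-rank factors, with the small matrices shared across layers; that detail goes beyond the truncated abstract above. The function names, initialization scale, and the sizes k=768, d=24, n=4, r=1 are illustrative choices, not the authors' implementation.

```python
import numpy as np

def init_factors(rng, k, d, n, r=1):
    """Hypothetical initializer for the factors of a (k, d) adapter weight."""
    assert k % n == 0 and d % n == 0, "n must divide both k and d"
    A = rng.normal(scale=0.1, size=(n, n, n))       # n small n x n matrices (assumed shared across layers)
    s = rng.normal(scale=0.1, size=(n, k // n, r))  # left low-rank factors, one per term
    t = rng.normal(scale=0.1, size=(n, r, d // n))  # right low-rank factors, one per term
    return A, s, t

def kronecker_sum_weight(A, s, t):
    """Return W = sum_i kron(A[i], s[i] @ t[i]) with shape (k, d)."""
    B = s @ t  # batched matmul: (n, k/n, d/n), each B[i] has rank <= r
    return sum(np.kron(A[i], B[i]) for i in range(A.shape[0]))

def count_params(*arrays):
    """Total number of scalar entries across the given arrays."""
    return sum(a.size for a in arrays)

rng = np.random.default_rng(0)
k, d, n = 768, 24, 4                      # hidden size, bottleneck size, number of Kronecker terms
A, s, t = init_factors(rng, k, d, n)

W_down = kronecker_sum_weight(A, s, t)    # stands in for a dense (768, 24) down-projection
dense_params = k * d                      # parameters in an ordinary dense projection
factored_params = count_params(A, s, t)   # n**3 + r * (k + d)
```

With these sizes the dense projection has 18,432 entries while the factored form stores 856, which illustrates where the parameter savings come from. The exact 0.047% figure depends on the full model configuration and is not reproduced by this sketch.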
Code & Models
- 🤗 CrisisNarratives/adapter-8classes-multi_label · model · 3 downloads
- 🤗 CrisisNarratives/adapter-13classes-single_label · model · 2 downloads
- 🤗 CrisisNarratives/adapter-8classes-single_label · model · 1 download
- 🤗 CrisisNarratives/adapter-9classes-single_label · model · 1 download
- 🤗 CrisisNarratives/adapter-9classes-multi_label · model · 1 download
- 🤗 CrisisNarratives/adapter-13classes-multi_label · model · 1 download
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning
