ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
Haoran You, Yipin Guo, Yichao Fu, Wei Zhou, Huihong Shi, Xiaofan Zhang, Souvik Kundu, Amir Yazdanbakhsh, Yingyan Celine Lin

TL;DR
This paper introduces ShiftAddLLM, a post-training reparameterization method that replaces multiplications with shift-and-add operations in pretrained large language models, significantly reducing memory, energy, and latency while maintaining accuracy.
Contribution
It proposes a novel post-training shift-and-add reparameterization technique for pretrained LLMs, enabling efficient multiplication-free models with minimal accuracy loss.
Findings
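Improves perplexity by an average of 5.6 points at 3-bit quantization, compared with the most competitive quantized LLMs
Reduces memory and energy consumption by over 80% relative to the original LLMs
Maintains competitive performance across five LLM families and eight tasks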
Abstract
Large language models (LLMs) have shown impressive performance on language tasks but face challenges when deployed on resource-constrained devices due to their extensive parameters and reliance on dense multiplications, resulting in high memory demands and latency bottlenecks. Shift-and-add reparameterization offers a promising solution by replacing costly multiplications with hardware-friendly primitives in both the attention and multi-layer perceptron (MLP) layers of an LLM. However, current reparameterization techniques require training from scratch or full parameter fine-tuning to restore accuracy, which is resource-intensive for LLMs. To address this, we propose accelerating pretrained LLMs through post-training shift-and-add reparameterization, creating efficient multiplication-free models, dubbed ShiftAddLLM. Specifically, we quantize each weight matrix into binary matrices…
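The abstract is cut off before it describes the actual procedure, so the following is only a minimal sketch of the general shift-and-add idea, not the authors' method. It assumes a greedy binary decomposition of one weight row, scales rounded to powers of two, and integer (fixed-point) activations. All function names are hypothetical.

```python
import math


def binarize_row(weights, num_bits):
    """Greedily approximate a weight row as sum(2**exponent * signs).

    Assumption for illustration: each scale is the mean absolute residual,
    rounded to the nearest power of two so it can be applied as a shift.
    """
    residual = list(weights)
    terms = []
    for _ in range(num_bits):
        mean_abs = sum(abs(r) for r in residual) / len(residual)
        if mean_abs == 0:
            break  # the row is already represented exactly
        exponent = round(math.log2(mean_abs))
        signs = [1 if r >= 0 else -1 for r in residual]
        scale = 2.0 ** exponent
        residual = [r - scale * s for r, s in zip(residual, signs)]
        terms.append((exponent, signs))
    return terms


def shift(value, exponent):
    """Scale an integer by 2**exponent using bit shifts only.

    A negative exponent becomes a right shift, which floors the result.
    """
    return value << exponent if exponent >= 0 else value >> -exponent


def shift_add_dot(terms, activations):
    """Dot product of a binarized row with integer activations.

    Uses only additions, subtractions, and shifts; no multiplications.
    """
    total = 0
    for exponent, signs in terms:
        acc = 0
        for sign, act in zip(signs, activations):
            acc = acc + act if sign > 0 else acc - act
        total += shift(acc, exponent)
    return total


row_terms = binarize_row([6.0, -2.0, 3.0, -5.0], num_bits=2)
result = shift_add_dot(row_terms, [1, 2, 3, 4])
```

With two binary terms the row is only approximated (here as 6, -2, 2, -6), so `result` matches the dot product with the approximated weights rather than the original ones. Adding more terms trades memory for a closer approximation.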
Taxonomy
Topics: Natural Language Processing Techniques · Handwritten Text Recognition Techniques · Mathematics, Computing, and Information Processing
