DBellQuant: Breaking the Bell with Double-Bell Transformation for LLMs Post Training Binarization
Zijian Ye, Wei Huang, Yifei Yu, Tianhe Ren, Zhongrui Wang, Xiaojuan Qi

TL;DR
DBellQuant introduces a novel post-training quantization framework that significantly reduces model size and computational requirements of LLMs by transforming weight distributions and smoothing activations, achieving near 1-bit weight and 6-bit activation quantization with minimal performance loss.
Contribution
It proposes the Learnable Transformation for Dual-Bell (LTDB) algorithm, a new method for transforming weight distributions to enable aggressive quantization of LLMs.
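To see why a dual-bell weight distribution is easier to binarize than a single-bell one, the minimal NumPy sketch below compares the relative error of plain sign binarization on both shapes. It is an illustration, not the LTDB algorithm itself: the helpers `binarize` and `relative_error`, the scale choice `alpha = mean(|w|)`, and the synthetic distributions are all assumptions introduced here.

```python
import numpy as np

def binarize(w):
    """Sign binarization with scale alpha = mean(|w|) (assumed, not from the paper)."""
    alpha = np.mean(np.abs(w))
    return alpha * np.where(w >= 0, 1.0, -1.0)

def relative_error(w):
    """Squared binarization error relative to the squared norm of w."""
    return np.sum((w - binarize(w)) ** 2) / np.sum(w ** 2)

rng = np.random.default_rng(0)
n = 100_000

# Single-bell: one zero-mean Gaussian, with many weights near zero.
single_bell = rng.normal(0.0, 1.0, n)

# Dual-bell: two narrow Gaussians centered at -1 and +1 (synthetic example).
centers = rng.choice([-1.0, 1.0], n)
dual_bell = centers + rng.normal(0.0, 0.2, n)

err_single = relative_error(single_bell)
err_dual = relative_error(dual_bell)
print(f"single-bell relative error: {err_single:.3f}")
print(f"dual-bell relative error:   {err_dual:.3f}")
```

Under these assumptions the single-bell error comes out near 1 - 2/π ≈ 0.36, while the dual-bell error is far smaller (around 0.04), because most weights already sit close to the two binarized values.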
Findings
Achieves nearly 1-bit weight compression and 6-bit activation quantization.
Achieves lower perplexity on WikiText-2 than previous methods such as BiLLM.
Maintains high model performance with minimal degradation.
Abstract
Large language models (LLMs) demonstrate remarkable performance but face substantial computational and memory challenges that limit their practical deployment. Quantization has emerged as a promising solution; however, its effectiveness is often limited by quantization errors arising from weight distributions that are not quantization-friendly and the presence of activation outliers. To address these challenges, we introduce DBellQuant, an innovative post-training quantization (PTQ) framework that achieves nearly 1-bit weight compression and 6-bit activation quantization with minimal performance degradation. DBellQuant uses the Learnable Transformation for Dual-Bell (LTDB) algorithm, which transforms single-bell weight distributions into dual-bell forms to reduce binarization errors and applies inverse transformations to smooth activations. DBellQuant sets a new state-of-the-art by…
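The pairing of a weight transformation with an inverse transformation on the activations can be sketched as follows. The abstract does not state the form of the transformation, so this example assumes a simple per-input-channel scale vector `s`, chosen by hand here (in LTDB the transformation is learned). It only shows that the layer output is unchanged while the activation outlier shrinks. The names `X`, `W`, `s`, `X_t`, and `W_t` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear layer: Y = X @ W, with one outlier activation channel.
X = rng.normal(size=(4, 8))
X[:, 3] *= 50.0                      # channel 3 carries large outliers
W = rng.normal(size=(8, 5))

# Hypothetical per-input-channel scales (hand-picked, not learned).
s = np.ones(8)
s[3] = 50.0

# Transform the weights and apply the inverse transform to the activations.
W_t = W * s[:, None]                 # scale row i of W by s[i]
X_t = X / s                          # scale column i of X by 1 / s[i]

Y_ref = X @ W
Y_t = X_t @ W_t

print("outputs match:", np.allclose(Y_ref, Y_t))
print("max |X| before:", np.abs(X).max())
print("max |X| after: ", np.abs(X_t).max())
```

Because the two scalings cancel inside the matrix product, the transformation can be chosen to reshape the weights for binarization without changing what the layer computes in full precision.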
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Natural Language Processing Techniques
