Exploring Model Invariance with Discrete Search for Ultra-Low-Bit Quantization

Yuqiao Wen; Yanshuai Cao; Lili Mou

arXiv:2502.06844·cs.LG·February 12, 2025

Open Access

TL;DR

This paper introduces InvarExplore, a framework for ultra-low-bit quantization of large language models that leverages multiple invariances, including permutation invariance, through a discrete search algorithm, improving performance over existing methods.

Contribution

It presents a novel unified framework that systematically explores multiple model invariances, especially permutation invariance, using a discrete search for ultra-low-bit quantization.
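To make the idea of permutation invariance concrete, here is a minimal sketch in plain Python. It assumes a toy two-layer feed-forward block, `y = W2 · relu(W1 · x)`, which is an illustrative stand-in rather than the architecture used in the paper. The helper names (`mlp`, `permute_hidden`) and all sizes are hypothetical. Reordering the hidden units, meaning the rows of `W1` together with the matching columns of `W2`, leaves the output unchanged up to floating-point rounding.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def mlp(W1, W2, x):
    # Toy two-layer block: y = W2 · relu(W1 · x).
    return matvec(W2, relu(matvec(W1, x)))

def permute_hidden(W1, W2, perm):
    # Reorder hidden units: rows of W1 and the matching columns of W2.
    W1p = [W1[j] for j in perm]
    W2p = [[row[j] for j in perm] for row in W2]
    return W1p, W2p

rng = random.Random(0)
d_in, d_hidden, d_out = 4, 6, 3  # illustrative sizes (assumption)
W1 = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_hidden)]
W2 = [[rng.gauss(0, 1) for _ in range(d_hidden)] for _ in range(d_out)]
x = [rng.gauss(0, 1) for _ in range(d_in)]

perm = list(range(d_hidden))
rng.shuffle(perm)
W1p, W2p = permute_hidden(W1, W2, perm)

y = mlp(W1, W2, x)
yp = mlp(W1p, W2p, x)
# Largest elementwise gap between the original and permuted outputs.
max_diff = max(abs(a - b) for a, b in zip(y, yp))
```

The invariance holds because the elementwise activation commutes with any reordering of its inputs. A permutation is a discrete object, which is why it has to be searched over rather than adjusted by gradient steps.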

Findings

01. Achieves performance improvements over state-of-the-art methods.

02. Compatible with existing quantization techniques.

03. Effectively explores permutation invariance with discrete search.

Abstract

Large language models have been increasing in size due to their success in a wide range of applications, which creates a pressing need to reduce memory usage and make them more accessible. Post-training quantization is a popular technique that uses fewer bits (e.g., 4 to 8 bits) to represent the model without retraining it. However, quantization remains challenging in an ultra-low-bit setup (e.g., 2 bits). In this paper, we propose InvarExplore, a unified framework that systematically explores different model invariances at the same time, allowing us to take advantage of the synergy between the types of invariance. Importantly, InvarExplore features a discrete search algorithm that enables us to explore permutation invariance, which is under-studied because it cannot be optimized with gradient-based methods. Results show that InvarExplore is compatible with existing…
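The following sketch illustrates how a discrete search over permutations can interact with low-bit quantization. It is not the InvarExplore algorithm, and every detail is an assumption made for illustration: the functions `quantize_group`, `quant_error`, and `hill_climb` are hypothetical, and so are the group-wise round-to-nearest 2-bit quantizer, the group size of 4, the random-swap proposals, and the use of weight reconstruction error as the objective. With group-wise quantization, a column permutation changes which weights share a quantization range, so some orderings may lose less precision than others.

```python
import random

def quantize_group(ws, bits=2):
    # Round-to-nearest uniform quantization of one group, then dequantize.
    lo, hi = min(ws), max(ws)
    if hi == lo:
        return list(ws)
    scale = (hi - lo) / ((1 << bits) - 1)
    return [lo + round((w - lo) / scale) * scale for w in ws]

def quant_error(W, perm, group_size=4, bits=2):
    # Squared reconstruction error after permuting columns and quantizing
    # consecutive groups within each row (a proxy objective, assumed here).
    err = 0.0
    for row in W:
        r = [row[j] for j in perm]
        for s in range(0, len(r), group_size):
            g = r[s:s + group_size]
            q = quantize_group(g, bits)
            err += sum((a - b) ** 2 for a, b in zip(g, q))
    return err

def hill_climb(W, steps=2000, seed=0, group_size=4, bits=2):
    # Propose random column swaps; keep a swap only if the error does not increase.
    rng = random.Random(seed)
    n = len(W[0])
    perm = list(range(n))
    best = quant_error(W, perm, group_size, bits)
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        perm[i], perm[j] = perm[j], perm[i]
        err = quant_error(W, perm, group_size, bits)
        if err <= best:
            best = err
        else:
            perm[i], perm[j] = perm[j], perm[i]  # revert the rejected swap
    return perm, best

rng = random.Random(1)
W2 = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(8)]  # toy weights
identity = list(range(16))
baseline = quant_error(W2, identity)
perm, searched = hill_climb(W2)
```

In a full model, the same permutation would also be applied to the rows of the preceding weight matrix so that the full-precision function stays unchanged. A search of this kind would more plausibly score candidates on model outputs over calibration data than on raw weight error.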

Taxonomy

Topics: Photonic and Optical Devices