Exploiting LLM Quantization

Kazuki Egashira; Mark Vero; Robin Staab; Jingxuan He; Martin Vechev

arXiv:2405.18137 · cs.LG · November 5, 2024 · 2 cites

TL;DR

This paper uncovers a security vulnerability in LLM quantization: an attacker can craft a full-precision model that appears benign but behaves maliciously once it is quantized, which threatens users who download such models and quantize them for deployment.

Contribution

The paper presents the first study of the security risks of LLM quantization. It demonstrates a three-stage attack framework that uses the constraints under which a full-precision model keeps the same quantized form to embed malicious behavior in the quantized model only.

Findings

01

Demonstrates the feasibility of malicious quantized LLMs in three scenarios: vulnerable code generation, content injection, and over-refusal

02

Shows full-precision models can appear benign but become harmful after quantization

03

Highlights potential security risks in deploying quantized LLMs on public platforms

Abstract

Quantization leverages lower-precision weights to reduce the memory usage of large language models (LLMs) and is a key technique for enabling their deployment on commodity hardware. While LLM quantization's impact on utility has been extensively explored, this work for the first time studies its adverse effects from a security perspective. We reveal that widely used quantization methods can be exploited to produce a harmful quantized LLM, even though the full-precision counterpart appears benign, potentially tricking users into deploying the malicious quantized model. We demonstrate this threat using a three-staged attack framework: (i) first, we obtain a malicious LLM through fine-tuning on an adversarial task; (ii) next, we quantize the malicious model and calculate constraints that characterize all full-precision models that map to the same quantized model; (iii) finally, using…
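Stage (ii) of the abstract, computing constraints that characterize every full-precision model mapping to the same quantized model, can be illustrated with a minimal sketch. It assumes a simple symmetric round-to-nearest quantizer with a single per-tensor absmax scale and a non-zero weight list; this is an illustrative simplification and not the quantization methods or the implementation used in the paper. The function names `quantize_absmax` and `preserving_intervals` and the `margin` parameter are hypothetical.

```python
def quantize_absmax(weights, levels=127):
    """Symmetric round-to-nearest quantization with one absmax scale.

    Returns the integer codes and the scale (assumes a non-zero tensor).
    """
    scale = max(abs(w) for w in weights) / levels
    return [round(w / scale) for w in weights], scale


def preserving_intervals(weights, levels=127, margin=0.01):
    """Per-weight (low, high) boxes that leave the quantized model unchanged.

    Any weights chosen inside these boxes quantize to the same integer
    codes with the same scale as the original weights.
    """
    codes, scale = quantize_absmax(weights, levels)
    absmax = scale * levels
    # The largest-magnitude weight defines the scale, so it is pinned.
    anchor = max(range(len(weights)), key=lambda i: abs(weights[i]))
    # Stay slightly inside the rounding boundary to avoid ties.
    half = (0.5 - margin) * scale
    boxes = []
    for i, (w, code) in enumerate(zip(weights, codes)):
        if i == anchor:
            boxes.append((w, w))
            continue
        center = code * scale
        # Clip so that no other weight can exceed absmax and change the scale.
        low = max(center - half, -absmax)
        high = min(center + half, absmax)
        boxes.append((low, high))
    return boxes
```

Under these assumptions, moving each weight anywhere inside its box changes the full-precision model but not its quantized counterpart, which is the degree of freedom the attack relies on.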

Code & Models

Repositories

eth-sri/llm-quantization-attack (PyTorch, official)
