Mind the Gap: A Practical Attack on GGUF Quantization

Kazuki Egashira; Robin Staab; Mark Vero; Jingxuan He; Martin Vechev

arXiv:2505.23786 · cs.CR · June 5, 2025

Open Access · 1 Repo · 10 Models

TL;DR

This paper introduces the first attack on GGUF quantization, which is widely used for large language models. The attack exploits quantization error to embed malicious behavior in the quantized model while the full-precision model appears benign.

Contribution

The paper presents a novel attack on the complex GGUF quantization schemes, demonstrating that widely used post-training quantization for LLMs is vulnerable.

Findings

01 Effective attack on three popular LLMs across nine GGUF data types

02 High success rates in insecure code generation and content injection scenarios

03 Complex quantization schemes are not inherently secure against adversarial attacks

Abstract

With the increasing size of frontier LLMs, post-training quantization has become the standard for memory-efficient deployment. Recent work has shown that basic rounding-based quantization schemes pose security risks, as they can be exploited to inject malicious behaviors into quantized models that remain hidden in full precision. However, existing attacks cannot be applied to more complex quantization methods, such as the GGUF family used in the popular ollama and llama.cpp frameworks. In this work, we address this gap by introducing the first attack on GGUF. Our key insight is that the quantization error -- the difference between the full-precision weights and their (de-)quantized version -- provides sufficient flexibility to construct malicious quantized models that appear benign in full precision. Leveraging this, we develop an attack that trains the target malicious LLM while…
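The notion of quantization error in the abstract can be illustrated with a minimal sketch. It assumes a simple symmetric absmax round-to-nearest scheme applied to a single block of weights. This is not the GGUF algorithm or the authors' attack. The function names (`quantize_block`, `dequantize_block`, `quantization_error`) and the weight values are hypothetical, chosen only to show that distinct full-precision weights can share the same quantized representation.

```python
def quantize_block(weights, bits=4):
    """Quantize one block with symmetric absmax round-to-nearest (illustrative only)."""
    qmax = 2 ** (bits - 1) - 1                  # largest positive integer code
    scale = max(abs(w) for w in weights) / qmax
    if scale == 0.0:                            # all-zero block: any scale works
        scale = 1.0
    codes = [round(w / scale) for w in weights]
    return codes, scale


def dequantize_block(codes, scale):
    """Map integer codes back to floating-point values."""
    return [c * scale for c in codes]


def quantization_error(weights, bits=4):
    """Per-weight difference between the original and de-quantized values."""
    codes, scale = quantize_block(weights, bits)
    return [w - d for w, d in zip(weights, dequantize_block(codes, scale))]


# Two different full-precision blocks (made-up numbers). The largest-magnitude
# weight is kept fixed so that the block scale does not change.
block_a = [0.70, -0.31, 0.12, 0.04]
block_b = [0.70, -0.27, 0.14, 0.02]

codes_a, scale_a = quantize_block(block_a)
codes_b, scale_b = quantize_block(block_b)

print("codes for block_a:", codes_a)
print("codes for block_b:", codes_b)
print("error for block_a:", quantization_error(block_a))
print("error for block_b:", quantization_error(block_b))
```

In this toy setting both blocks map to the same integer codes, so they are indistinguishable after quantization even though their full-precision values differ. That slack is the kind of flexibility the abstract refers to. The actual GGUF data types use more involved block-wise schemes, so the intervals that preserve a given quantized model would have to be derived differently there.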

Code & Models

Repositories

eth-sri/llm-quantization-attack
PyTorch · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Security and Verification in Computing · Physical Unclonable Functions (PUFs) and Hardware Security