Quantization Undoes Alignment: Bias Emergence in Compressed LLMs Across Models and Precision Levels

Plawan Kumar Rath; Rahul Maliakkal

arXiv:2605.15208 · cs.LG · May 18, 2026

TL;DR

This study shows that aggressive post-training quantization of large language models can introduce new biased, stereotypical behaviors that standard metrics such as perplexity fail to detect, highlighting the need for bias-aware compression methods.

Contribution

The paper provides a comprehensive empirical analysis of bias emergence in quantized LLMs across multiple models, precision levels, and bias benchmarks, revealing hidden fairness risks.

Findings

01

3-bit quantization causes 6-21% of previously unbiased items to develop new stereotypical behaviors

02

Perplexity metrics fail to detect bias emergence at lower precisions

03

A significant portion of items develop biases at 4-bit quantization despite minimal perplexity increase
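
The findings above separate item-level bias emergence from aggregate scores. The following is a minimal Python sketch of that distinction, not the authors' evaluation code. It assumes each benchmark item has already been reduced to a boolean "biased" flag per precision level, and it assumes an item counts as "emerged" when it is unbiased at the baseline precision and biased after quantization. The function names `emergence_rate` and `aggregate_bias_rate` and the toy flags are hypothetical.

```python
from typing import Sequence


def aggregate_bias_rate(flags: Sequence[bool]) -> float:
    """Fraction of all items flagged as biased (an aggregate metric)."""
    return sum(flags) / len(flags)


def emergence_rate(baseline_biased: Sequence[bool],
                   quantized_biased: Sequence[bool]) -> float:
    """Share of baseline-unbiased items that are biased after quantization.

    Assumed definition for illustration; the paper's exact criterion
    is not given in the text above.
    """
    if len(baseline_biased) != len(quantized_biased):
        raise ValueError("inputs must cover the same items in the same order")
    # Indices of items that were unbiased at the baseline precision.
    unbiased_idx = [i for i, biased in enumerate(baseline_biased) if not biased]
    if not unbiased_idx:
        return 0.0
    emerged = sum(1 for i in unbiased_idx if quantized_biased[i])
    return emerged / len(unbiased_idx)


# Toy flags, made up for illustration only (not results from the paper).
baseline = [False, False, False, False, True, True]
quantized = [True, False, False, False, False, True]

print(aggregate_bias_rate(baseline) == aggregate_bias_rate(quantized))
print(emergence_rate(baseline, quantized))
```

In this toy case the aggregate bias rate is identical before and after quantization, yet one of the four baseline-unbiased items has flipped. This is one way an aggregate metric can stay flat while individual items change, which is why an item-level comparison is sketched here.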

Abstract

Large Language Models are routinely compressed via post-training quantization to reduce inference costs and memory footprint for cloud and edge deployment, yet the impact of this compression on model quality remains poorly understood. Existing studies typically compare only two conditions (full-precision vs. a single quantized variant), rely on aggregate bias metrics, and evaluate a single model family, making it impossible to distinguish gradual degradation from threshold-dependent safety failures. We conduct a controlled empirical study of three instruction-tuned models (Qwen2.5-7B, Mistral-7B, Phi-3.5-mini) at five precision levels (BF16 through 3-bit) on 12,148 BBQ bias benchmark items across 5 random seeds, totaling 911,100 inference records. Our results reveal that 3-bit quantization causes 6-21% of previously unbiased items to develop new stereotypical behaviors, following a…
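
The record count in the abstract follows directly from the size of the experimental grid. The short Python check below reproduces that arithmetic from the figures stated in the abstract; the variable names are illustrative.

```python
# Grid dimensions as stated in the abstract.
n_models = 3             # Qwen2.5-7B, Mistral-7B, Phi-3.5-mini
n_precision_levels = 5   # BF16 through 3-bit
n_items = 12_148         # BBQ bias benchmark items
n_seeds = 5              # random seeds

# One inference record per (model, precision, item, seed) combination.
total_records = n_models * n_precision_levels * n_items * n_seeds
print(total_records)
```

The product matches the 911,100 inference records reported in the abstract.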
