Beyond Perplexity: Multi-dimensional Safety Evaluation of LLM Compression
Zhichao Xu, Ashim Gupta, Tao Li, Oliver Bentham, Vivek Srikumar

TL;DR
This paper systematically evaluates how different model compression techniques affect various safety aspects of large language models, revealing complex impacts on bias, toxicity, and task performance beyond traditional perplexity measures.
Contribution
It introduces a multi-dimensional safety assessment framework for compressed LLMs, highlighting the diverse and sometimes unexpected effects of compression methods on safety and bias.
Findings
Compression can reduce degeneration harm but may increase representational harm.
Different compression methods have distinct safety impacts, e.g., quantization vs. pruning.
Safety impacts vary across protected groups and bias types.
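To make the contrast between pruning and quantization concrete, here is a minimal sketch of generic textbook versions of each: magnitude-based unstructured pruning, an N:M semi-structured variant, and symmetric round-to-nearest quantization. These are illustrative assumptions, not the specific algorithms evaluated in the paper; the function names and the sample weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Unstructured pruning: zero the smallest-magnitude fraction of weights."""
    k = int(len(weights) * sparsity)  # number of weights to drop
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def prune_n_m(weights, n=2, m=4):
    """Semi-structured N:M pruning: keep the n largest-magnitude weights
    in every consecutive group of m, zero the rest."""
    out = list(weights)
    for start in range(0, len(weights), m):
        group = range(start, min(start + m, len(weights)))
        ranked = sorted(group, key=lambda i: abs(weights[i]), reverse=True)
        keep = set(ranked[:n])
        for i in group:
            if i not in keep:
                out[i] = 0.0
    return out


def quantize_rtn(weights, bits=4):
    """Symmetric round-to-nearest quantization with one per-tensor scale.
    Returns the dequantized (float) values."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)
    scale = max_abs / qmax
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


# Hypothetical weight vector, for illustration only.
w = [0.1, -0.9, 0.8, 0.4, -0.3, 0.05, 0.02, -0.6]
print(magnitude_prune(w, 0.5))
print(prune_n_m(w, n=2, m=4))
print(quantize_rtn(w, bits=4))
```

Pruning removes weights outright, while quantization keeps every weight but snaps it to a coarse grid. The two perturb the model differently, which is one plausible reason their safety effects could diverge.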
Abstract
Increasingly, model compression techniques enable large language models (LLMs) to be deployed in real-world applications. As a result of this momentum towards local deployment, compressed LLMs will interact with a large population. Prior work on compression typically prioritizes preserving perplexity, which is directly analogous to training loss. The impact of compression methods on other critical aspects of model behavior, particularly safety, requires systematic assessment. To this end, we investigate the impact of model compression along four dimensions: (1) degeneration harm, i.e., bias and toxicity in generation; (2) representational harm, i.e., biases in discriminative tasks; (3) dialect bias; and (4) language modeling and downstream task performance. We examine a wide spectrum of LLM compression techniques, including unstructured pruning, semi-structured pruning, and…
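The abstract's remark that perplexity is directly analogous to training loss follows from the standard definition: perplexity is the exponential of the mean per-token negative log-likelihood, which is the cross-entropy loss minimized in training. A minimal sketch, assuming natural-log token probabilities are already available (the `perplexity` helper and the sample values are hypothetical):

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token), natural log."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical example: a model that assigns probability 0.25 to every token
# is as uncertain as a uniform choice among four options.
log_probs = [math.log(0.25)] * 8
print(perplexity(log_probs))
```

Because this metric only summarizes average next-token likelihood, two compressed models with similar perplexity can still differ in what they generate, which is why the paper looks at the other dimensions listed above.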
Taxonomy
Topics: Advancements in Semiconductor Devices and Circuit Design · Advancements in Photolithography Techniques · Electrostatic Discharge in Electronics
Methods: Pruning
