A Systematic Study of Compression Ordering for Large Language Models
Shivansh Chhawri, Rahul Mahadik, Suparna Rooj

TL;DR
This paper systematically investigates how the order in which compression techniques such as pruning, distillation, and quantization are applied affects the performance and compression ratio of large language models, and it provides practical guidelines for efficient deployment.
Contribution
The paper presents a comprehensive analysis of compression technique sequences for LLMs and identifies the ordering that best balances compression and performance.
Findings
Quantization achieves the highest standalone compression.
Pruning causes moderate quality degradation.
Applying pruning, then distillation, then quantization (P-KD-Q) yields the best balance between compression and performance.
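To make the ordering notation concrete, here is a minimal Python sketch that enumerates every possible three-technique ordering using the labels P, KD, and Q, and computes a compression ratio. The function names are hypothetical, the ratio definition (original size divided by compressed size) is an assumption rather than the paper's stated formula, and the paper may not have evaluated all of these orderings.

```python
from itertools import permutations

# Labels as used in this summary: P = pruning, KD = knowledge distillation,
# Q = quantization.
TECHNIQUES = ("P", "KD", "Q")


def three_technique_orderings(techniques=TECHNIQUES):
    """Return every ordering of the techniques as a hyphenated label such as 'P-KD-Q'."""
    return ["-".join(order) for order in permutations(techniques)]


def compression_ratio(original_size, compressed_size):
    """Assumed definition: original model size divided by compressed model size."""
    if compressed_size <= 0:
        raise ValueError("compressed_size must be positive")
    return original_size / compressed_size


orderings = three_technique_orderings()
for label in orderings:
    print(label)
```

Three techniques give six candidate orderings, with P-KD-Q being the one the findings single out.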
Abstract
Large Language Models (LLMs) require substantial computational resources, making model compression essential for efficient deployment in constrained environments. The individual effects of the dominant compression techniques (knowledge distillation, structured pruning, and low-bit quantization) are well studied, but their interactions and optimal sequencing remain unclear. This work systematically examines how these techniques perform both independently and in combination when applied to the Qwen2.5 3B model. We evaluate multiple compression pipelines, including single-technique and proposed three-technique sequences, using perplexity, G-Eval, clarity, prompt alignment, and compression ratio as metrics. Our experiments show that quantization provides the greatest standalone compression, while pruning introduces moderate quality degradation. Critically, the ordering of techniques…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Artificial Intelligence in Healthcare and Education
