Compressing Language Models for Specialized Domains

Miles Williams; George Chrysostomou; Vitor Jeronymo; Nikolaos Aletras

arXiv:2502.18424 · cs.CL · February 26, 2026

TL;DR

This paper introduces MixCal, a post-training calibration method that improves the in-domain performance of compressed language models in specialized domains (e.g. biomedical or legal) without the computationally expensive full-parameter fine-tuning that recent approaches require.

Contribution

MixCal offers a novel post-training calibration approach that improves domain-specific performance of compressed LMs without expensive full-parameter fine-tuning.

Findings

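01. MixCal substantially outperforms existing approaches on domain-specific tasks.

02. MixCal preserves the general performance of the compressed language model.

03. MixCal operates in a post-training setting, avoiding the computationally expensive full-parameter fine-tuning pipeline that recent domain-specific compression work requires.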

Abstract

Language models (LMs) excel at tasks across diverse domains, yet require substantial computational resources during inference. Compression techniques such as pruning and quantization offer a practical path towards efficient LM deployment, exemplified by their ability to preserve performance on general-purpose benchmarks. However, general-purpose LM compression methods can negatively affect performance in specialized domains (e.g. biomedical or legal). Recent work has sought to address this issue, but requires a computationally expensive full-parameter fine-tuning pipeline. To this end, we propose MixCal, a novel calibration method designed to improve the in-domain performance of compressed LMs in a post-training setting. Through extensive experimentation, we demonstrate that MixCal substantially outperforms existing approaches on domain-specific tasks and preserves general performance.…
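To make the role of calibration data concrete, the sketch below shows a generic activation-aware pruning step in the style of metrics such as Wanda, where each weight is scored by its magnitude times the norm of the corresponding input feature over a calibration set. This is not the MixCal method, whose details are not given in the abstract; the function name `prune_with_calibration`, its parameters, and the toy data are all assumptions made for illustration. It only shows why the choice of calibration data can change which weights survive compression.

```python
import numpy as np


def prune_with_calibration(weight, calib_inputs, sparsity=0.5):
    """Zero the lowest-scoring weights in each output row (hypothetical helper).

    weight:       array of shape (out_features, in_features)
    calib_inputs: array of shape (n_samples, in_features), activations
                  collected by running calibration data through the model
    sparsity:     fraction of weights to remove in every row
    """
    # L2 norm of each input feature across the calibration samples.
    feature_norms = np.linalg.norm(calib_inputs, axis=0)
    # Importance score: weight magnitude scaled by how active its input is.
    scores = np.abs(weight) * feature_norms[np.newaxis, :]
    n_prune = int(weight.shape[1] * sparsity)
    # Column indices of the n_prune lowest scores in each row.
    prune_idx = np.argsort(scores, axis=1)[:, :n_prune]
    pruned = weight.copy()
    np.put_along_axis(pruned, prune_idx, 0.0, axis=1)
    return pruned


# Toy illustration (assumed data): identical weights, two calibration sets.
weight = np.ones((2, 4))
general_calib = np.array([[1.0, 2.0, 3.0, 4.0]])  # features 2 and 3 most active
domain_calib = np.array([[4.0, 3.0, 2.0, 1.0]])   # features 0 and 1 most active

pruned_general = prune_with_calibration(weight, general_calib)
pruned_domain = prune_with_calibration(weight, domain_calib)
```

With the same weights, the two calibration sets keep different columns: a model calibrated only on general text may discard weights that matter for in-domain inputs, which is the failure mode the abstract describes for specialized domains.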

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Pruning