Compressing Language Models for Specialized Domains
Miles Williams, George Chrysostomou, Vitor Jeronymo, Nikolaos Aletras

TL;DR
This paper introduces MixCal, a calibration method that improves the in-domain performance of compressed language models in specialized domains (e.g. biomedical or legal) while avoiding the computationally expensive full-parameter fine-tuning that recent approaches require.
Contribution
MixCal offers a novel post-training calibration approach that improves domain-specific performance of compressed LMs without expensive full-parameter fine-tuning.
Findings
MixCal substantially outperforms existing approaches on domain-specific tasks.
It preserves the general performance of the compressed LM.
It reduces the computational cost of compression relative to approaches that require full-parameter fine-tuning.
Abstract
Language models (LMs) excel at tasks across diverse domains, yet require substantial computational resources during inference. Compression techniques such as pruning and quantization offer a practical path towards efficient LM deployment, exemplified by their ability to preserve performance on general-purpose benchmarks. However, general-purpose LM compression methods can negatively affect performance in specialized domains (e.g. biomedical or legal). Recent work has sought to address this issue, but requires a computationally expensive full-parameter fine-tuning pipeline. To this end, we propose MixCal, a novel calibration method designed to improve the in-domain performance of compressed LMs in a post-training setting. Through extensive experimentation, we demonstrate that MixCal substantially outperforms existing approaches on domain-specific tasks and preserves general performance.…
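The abstract does not say how MixCal builds its calibration data, so the following is not MixCal. It is a minimal toy sketch of why the choice of calibration data matters in post-training pruning generally, assuming a Wanda-style activation-aware score (|weight| times the L2 norm of the matching input feature over the calibration samples). The layer weights, the activation values, and the names `column_norms` and `prune_mask` are all hypothetical and chosen only for illustration.

```python
import math


def column_norms(calib):
    """L2 norm of each input feature across the calibration samples."""
    n_features = len(calib[0])
    return [math.sqrt(sum(row[j] ** 2 for row in calib)) for j in range(n_features)]


def prune_mask(weights, calib, sparsity=0.5):
    """Return a 0/1 keep-mask for each weight row.

    Score = |w_ij| * ||x_j||_2 (an assumed Wanda-style criterion);
    the lowest-scoring fraction `sparsity` of each row is pruned.
    """
    norms = column_norms(calib)
    mask = []
    for row in weights:
        scores = [abs(w) * n for w, n in zip(row, norms)]
        n_prune = int(len(row) * sparsity)
        pruned = set(sorted(range(len(row)), key=lambda j: scores[j])[:n_prune])
        mask.append([0 if j in pruned else 1 for j in range(len(row))])
    return mask


# Toy layer: one output neuron, four input features (hypothetical values).
weights = [[0.9, 0.8, 0.2, 0.1]]

# Hypothetical calibration activations: general text mostly activates
# features 0-1, while in-domain text mostly activates features 2-3.
general = [[1.0, 1.0, 0.1, 0.1]] * 4
in_domain = [[0.1, 0.1, 10.0, 10.0]] * 4

mask_general = prune_mask(weights, general)
mask_domain = prune_mask(weights, in_domain)

print(mask_general)  # [[1, 1, 0, 0]]
print(mask_domain)   # [[0, 0, 1, 1]]
```

With general calibration data the two small weights are pruned, even though they sit on the features that in-domain text relies on. With in-domain calibration data the opposite weights survive. This is the kind of trade-off between general and in-domain performance that a calibration method for specialized domains has to manage.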
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Pruning
