Calibrating Beyond English: Language Diversity for Better Quantized Multilingual LLM
Everlyn Asiko Chimoto, Mostafa Elhoushi, Bruce A. Bassett

TL;DR
This paper shows that calibrating with diverse, multilingual data significantly improves the perplexity of quantized multilingual large language models compared with English-only calibration, and that matching the calibration language to the evaluation language gives the best results.
Contribution
It systematically evaluates calibration strategies across multiple languages and quantizers, revealing the benefits of multilingual and language-tailored calibration sets for quantized LLMs.
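As a rough illustration of what a "calibration set" choice looks like in practice, the sketch below draws a fixed number of calibration texts either from one language or evenly from several. The function name `build_calibration_set`, the language codes, the pool contents, and the sample count are all hypothetical; the source does not specify the languages or sampling procedure used.

```python
import random

def build_calibration_set(pools, languages, n_samples, seed=0):
    """Draw n_samples texts, split as evenly as possible across `languages`.

    `pools` maps a language code to a list of candidate texts.
    This is an illustrative sampler, not the procedure from the paper.
    """
    rng = random.Random(seed)
    base, extra = divmod(n_samples, len(languages))
    picked = []
    for i, lang in enumerate(languages):
        # The first `extra` languages receive one additional sample.
        k = base + (1 if i < extra else 0)
        picked.extend((lang, text) for text in rng.sample(pools[lang], k))
    rng.shuffle(picked)
    return picked

# Hypothetical text pools; real calibration data would be natural-language text.
pools = {
    "en": [f"en sentence {i}" for i in range(50)],
    "sw": [f"sw sentence {i}" for i in range(50)],
    "fr": [f"fr sentence {i}" for i in range(50)],
}

english_only = build_calibration_set(pools, ["en"], n_samples=8)
multilingual = build_calibration_set(pools, ["en", "sw", "fr"], n_samples=8)
```

Either list would then be tokenized and passed to the quantizer as its calibration data; only the language composition differs between the two settings.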
Findings
Multilingual calibration sets reduce perplexity more than English-only sets.
Tailoring calibration to the evaluation language yields the best performance.
Certain language-quantizer combinations can degrade performance due to activation range differences.
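To make the activation-range point concrete, here is a minimal sketch using a plain absmax 4-bit quantizer, which is a deliberate simplification and not GPTQ or AWQ. It assumes the quantization scale is fixed from calibration activations and then applied to evaluation activations; all activation values are invented for illustration.

```python
def absmax_scale(calib_values, bits=4):
    """Scale that maps the largest calibration magnitude to the top integer level."""
    qmax = 2 ** (bits - 1) - 1
    return max(abs(v) for v in calib_values) / qmax

def fake_quantize(values, scale, bits=4):
    """Round to the integer grid, clip to the signed range, and map back to floats."""
    qmax = 2 ** (bits - 1) - 1
    out = []
    for v in values:
        q = round(v / scale)
        q = max(-qmax - 1, min(qmax, q))  # values outside the range are clipped
        out.append(q * scale)
    return out

def mean_abs_error(values, scale, bits=4):
    """Average absolute difference between values and their quantized versions."""
    deq = fake_quantize(values, scale, bits)
    return sum(abs(a - b) for a, b in zip(values, deq)) / len(values)

# Hypothetical activations: the evaluation language has a wider range
# than the "narrow" calibration set but a similar range to the "wide" one.
narrow_calib = [-1.0, -0.5, 0.2, 0.7, 1.0]
wide_calib = [-4.0, -1.0, 0.3, 2.0, 4.0]
eval_acts = [-3.5, -0.8, 0.4, 1.9, 3.8]

err_narrow = mean_abs_error(eval_acts, absmax_scale(narrow_calib))
err_wide = mean_abs_error(eval_acts, absmax_scale(wide_calib))
print(f"error with narrow-range calibration: {err_narrow:.3f}")
print(f"error with matching-range calibration: {err_wide:.3f}")
```

In this toy setting the narrow-range scale clips the larger evaluation activations, so its error is higher. Real quantizers handle ranges in more sophisticated ways, so this only shows why a range mismatch between calibration and evaluation data can matter.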
Abstract
Quantization is an effective technique for reducing the storage footprint and computational costs of Large Language Models (LLMs), but it often results in performance degradation. Existing post-training quantization methods typically use small, English-only calibration sets; however, their impact on multilingual models remains underexplored. We systematically evaluate eight calibration settings (five single-language and three multilingual mixes) on two quantizers (GPTQ, AWQ) on data from 10 languages. Our findings reveal a consistent trend: non-English and multilingual calibration sets significantly improve perplexity compared to English-only baselines. Specifically, we observe notable average perplexity gains across both quantizers on Llama3.1 8B and Qwen2.5 7B, with multilingual mixes achieving the largest overall reductions of up to 3.52 points in perplexity. Furthermore, our…
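For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-likelihood per token, so a reduction "in points" is a difference between two such values. The sketch below computes it from per-token log-probabilities; the numbers are made up and do not correspond to any result in the paper.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(-mean log-probability), using natural logarithms."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Hypothetical per-token log-probabilities from two quantized models
# evaluated on the same text.
logprobs_model_a = [math.log(0.25)] * 4   # every token assigned probability 0.25
logprobs_model_b = [math.log(0.50)] * 4   # every token assigned probability 0.50

ppl_a = perplexity(logprobs_model_a)
ppl_b = perplexity(logprobs_model_b)
reduction_in_points = ppl_a - ppl_b
```

Lower perplexity means the model assigns higher probability to the reference text, which is why the abstract reports improvements as reductions.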
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification
