Calibrating Beyond English: Language Diversity for Better Quantized Multilingual LLM

Everlyn Asiko Chimoto; Mostafa Elhoushi; Bruce A. Bassett

arXiv:2601.18306 · cs.CL · January 27, 2026

TL;DR

Calibrating post-training quantizers (GPTQ, AWQ) on non-English or multilingual data, rather than on English-only data, lowers the perplexity of quantized multilingual large language models. Matching the calibration language to the evaluation language gives the best results.

Contribution

The paper systematically evaluates eight calibration settings (five single-language and three multilingual mixes) with two quantizers (GPTQ, AWQ) on data from 10 languages, showing the benefits of multilingual and language-tailored calibration sets for quantized LLMs.
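To make the distinction between single-language and multilingual calibration concrete, here is a minimal sketch of assembling both kinds of set from per-language text pools. The function name `build_calibration_set`, the language codes, the toy corpora, and the 128-sample budget are all illustrative assumptions, not details taken from the paper. The even split across languages is one possible mixing rule among many.

```python
import random
from collections import Counter


def build_calibration_set(corpora, languages, n_samples, seed=0):
    """Draw n_samples texts, split as evenly as possible across languages.

    corpora   -- dict mapping a language code to a list of texts (hypothetical data)
    languages -- languages to include; a single entry gives a monolingual set
    """
    rng = random.Random(seed)
    base, extra = divmod(n_samples, len(languages))
    samples = []
    for i, lang in enumerate(languages):
        # The first `extra` languages receive one additional sample.
        quota = base + (1 if i < extra else 0)
        pool = corpora[lang]
        if quota > len(pool):
            raise ValueError(f"not enough text for {lang!r}: need {quota}")
        samples.extend((lang, text) for text in rng.sample(pool, quota))
    rng.shuffle(samples)  # avoid feeding the quantizer one language at a time
    return samples


# Toy stand-in corpora; a real run would load text in each language.
corpora = {
    lang: [f"{lang} sentence {i}" for i in range(200)]
    for lang in ["en", "sw", "fr"]
}

english_only = build_calibration_set(corpora, ["en"], 128)
multilingual = build_calibration_set(corpora, ["en", "sw", "fr"], 128)

print(Counter(lang for lang, _ in multilingual))
```

The resulting list of texts would then be tokenized and handed to whichever quantizer is under test. How the calibration data is passed in depends on the quantization library and is not shown here.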

Findings

01. Multilingual calibration sets reduce perplexity more than English-only sets.

02. Tailoring calibration to the evaluation language yields the best performance.

03. Certain language-quantizer combinations can degrade performance due to activation range differences.
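The third finding can be illustrated with a toy example of how a mismatch in activation ranges hurts a quantizer. The sketch below is not GPTQ or AWQ. It is a plain symmetric uniform quantizer whose scale is fixed from calibration values, and every number in it is made up for illustration. It only shows the general mechanism: values outside the calibrated range get clipped.

```python
def calibrate_scale(calib_values, bits=4):
    """Pick a symmetric quantization scale from the largest calibration magnitude."""
    qmax = 2 ** (bits - 1) - 1
    return max(abs(v) for v in calib_values) / qmax


def fake_quantize(values, scale, bits=4):
    """Round to the integer grid, clip to the representable range, map back to floats."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -qmax - 1
    return [max(qmin, min(qmax, round(v / scale))) * scale for v in values]


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical activations: the evaluation data has a wider range
# than the narrow calibration data.
narrow_calib = [-1.0, -0.5, 0.25, 1.0]
eval_acts = [-4.0, -0.5, 0.25, 1.0, 4.0]

scale_narrow = calibrate_scale(narrow_calib)  # range learned from mismatched data
scale_wide = calibrate_scale(eval_acts)       # range learned from matching data

err_narrow = mse(eval_acts, fake_quantize(eval_acts, scale_narrow))
err_wide = mse(eval_acts, fake_quantize(eval_acts, scale_wide))

print(f"error with mismatched calibration: {err_narrow:.4f}")
print(f"error with matched calibration:    {err_wide:.4f}")
```

With the narrow scale, the values at ±4.0 are clipped to roughly ±1, which dominates the error. Whether this is the exact mechanism behind the degradation the paper reports is not something this page states, so treat the sketch as intuition only.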

Abstract

Quantization is an effective technique for reducing the storage footprint and computational costs of Large Language Models (LLMs), but it often results in performance degradation. Existing post-training quantization methods typically use small, English-only calibration sets; however, their impact on multilingual models remains underexplored. We systematically evaluate eight calibration settings (five single-language and three multilingual mixes) on two quantizers (GPTQ, AWQ) on data from 10 languages. Our findings reveal a consistent trend: non-English and multilingual calibration sets significantly improve perplexity compared to English-only baselines. Specifically, we observe notable average perplexity gains across both quantizers on Llama3.1 8B and Qwen2.5 7B, with multilingual mixes achieving the largest overall reductions of up to 3.52 points in perplexity. Furthermore, our…
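Because the results above are reported as perplexity differences in points, a short sketch of the standard perplexity definition may help: the exponential of the mean negative log-likelihood per token. The per-token probabilities below are invented for illustration and do not come from the paper. The variable names are likewise hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(-(1/N) * sum of natural-log token probabilities)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Made-up per-token probabilities assigned to the same evaluation text by
# two quantized copies of a model that differ only in calibration data.
english_calibrated = [math.log(p) for p in (0.20, 0.10, 0.25, 0.05)]
multilingual_calibrated = [math.log(p) for p in (0.25, 0.12, 0.30, 0.08)]

ppl_en = perplexity(english_calibrated)
ppl_multi = perplexity(multilingual_calibrated)

# A positive gain means the multilingual-calibrated model has lower perplexity.
gain = ppl_en - ppl_multi
print(f"English-only: {ppl_en:.2f}  multilingual: {ppl_multi:.2f}  gain: {gain:.2f}")
```

Lower perplexity is better, so a "reduction of up to 3.52 points" in the abstract corresponds to the `gain` quantity here being as large as 3.52 for some setting.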

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification