Bielik-Q2-Sharp: A Comparative Study of Extreme 2-bit Quantization Methods for a Polish 11B Language Model
Jakub Prejzner

TL;DR
This paper systematically evaluates extreme 2-bit quantization methods on a Polish large language model, comparing their performance, efficiency, and behavior in autoregressive tasks, with insights into their practical trade-offs.
Contribution
It provides the first comprehensive comparison of six state-of-the-art 2-bit quantization methods on a Polish LLM, highlighting their relative performance and unique behaviors.
Findings
QuIP# E8P12 achieves near-baseline accuracy (71.92% vs. 72.07% for IQ2_XXS across 22 Polish benchmarks) at a modest size premium (3.26 GB vs. ~2.6 GB).
QTIP offers the best per-bit efficiency (79.4% MC acc_norm at ~2.4 bpw, 3.27 GB), matching VPTQ's quality at a 35% smaller size.
Rotation-based methods preserve log-likelihood but fail in autoregressive generation.
Abstract
We present Bielik-Q2-Sharp, the first systematic academic evaluation of extreme 2-bit quantization applied to a Polish large language model. Using Bielik-11B-v2.3-Instruct (11B parameters, Mistral architecture) as our base model, we compare six state-of-the-art post-training quantization methods -- QuIP#, SpinQuant+GPTQ, ButterflyQuant, QTIP, VPTQ, and AQLM -- all calibrated on a Polish-language corpus (CulturaX-PL) with shared Hessian matrices. Our best variant (QuIP# E8P12) achieves 71.92% across 22 Polish benchmarks versus 72.07% for the IQ2_XXS baseline -- within statistical noise, at a modest size premium (3.26 GB vs. ~2.6 GB). On eq_bench, our method scores 47.14 versus 43.53 (+3.6pp), suggesting superior preservation of higher-order reasoning. QTIP achieves the best per-bit efficiency (79.4% MC acc_norm at ~2.4 bpw, 3.27 GB), matching VPTQ's quality at 35% smaller size. We…
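The headline figures in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names are hypothetical, the parameter count is taken as exactly 11e9 (the abstract says only "11B"), 1 GB is assumed to mean 10^9 bytes, and the size estimate ignores any layers kept at higher precision as well as codebook or metadata overhead.

```python
# Back-of-the-envelope checks on numbers quoted in the abstract.
# Helper names are hypothetical; they are not from the paper's code.

def size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (assumed 10^9 bytes) at a given bit width."""
    return n_params * bits_per_weight / 8 / 1e9


def relative_premium(ours: float, baseline: float) -> float:
    """Fractional size increase of `ours` over `baseline`."""
    return (ours - baseline) / baseline


N_PARAMS = 11e9  # assumption: "11B parameters" taken literally

# QTIP at ~2.4 bits per weight (reported file size: 3.27 GB)
qtip_estimate = size_gb(N_PARAMS, 2.4)

# QuIP# E8P12 vs. the IQ2_XXS baseline, averaged over 22 Polish benchmarks
acc_gap = 72.07 - 71.92

# eq_bench: QuIP# E8P12 vs. baseline
eq_gain = 47.14 - 43.53

# Size premium of QuIP# E8P12 (3.26 GB) over the baseline (~2.6 GB, approximate)
premium = relative_premium(3.26, 2.6)

print(f"QTIP size estimate: {qtip_estimate:.2f} GB")  # 3.30 GB
print(f"Accuracy gap:       {acc_gap:.2f} pp")        # 0.15 pp
print(f"eq_bench gain:      {eq_gain:.2f} pp")        # 3.61 pp
print(f"Size premium:       {premium:.1%}")           # 25.4%
```

Under these assumptions, the ~2.4 bpw estimate lands close to the reported 3.27 GB, and the "modest size premium" works out to roughly a quarter more storage than the baseline. That percentage is only as precise as the approximate ~2.6 GB figure.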
Taxonomy
Topics: Natural Language Processing Techniques · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
