CompactQE: Interpretable Translation Quality Estimation via Small Open-Weight LLMs
Kamil Guttmann, Zofia Fraś, Artur Nowakowski, Krzysztof Jassem

TL;DR
CompactQE shows that small open-source LLMs (<30B parameters) can produce interpretable translation quality assessments and error analyses, approximating the capabilities of much larger proprietary models while preserving data privacy and reducing cost.
Contribution
The paper demonstrates that small open-weight LLMs can effectively perform translation quality estimation and error analysis using a single-pass prompting approach, challenging reliance on large proprietary models.
Findings
Small open-source LLMs achieve system-level correlations with human judgments that outperform traditional neural metrics, fine-tuned models, and human inter-annotator agreement.
Single-pass prompting enables simultaneous quality scoring and error annotation.
The models effectively approximate the capabilities of much larger proprietary LLMs.
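To make "system-level correlation" concrete, here is a minimal sketch of one common way to compute it: average the segment scores for each MT system, then correlate the per-system averages of the metric with those of the human judgments. This summary does not say which correlation statistic the paper uses, so the choice of Pearson, the function names, and the input layout are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def system_level_correlation(rows):
    """rows: iterable of (system_name, metric_score, human_score), one per segment.

    Hypothetical helper: scores are first averaged per system, and the
    correlation is computed over those per-system averages.
    """
    metric_scores = defaultdict(list)
    human_scores = defaultdict(list)
    for system, metric, human in rows:
        metric_scores[system].append(metric)
        human_scores[system].append(human)

    systems = sorted(metric_scores)
    metric_means = [sum(metric_scores[s]) / len(metric_scores[s]) for s in systems]
    human_means = [sum(human_scores[s]) / len(human_scores[s]) for s in systems]
    return pearson(metric_means, human_means)


# Toy data (invented for illustration, not results from the paper).
toy_rows = [
    ("sysA", 80, 40), ("sysA", 90, 45),
    ("sysB", 60, 30), ("sysB", 70, 35),
    ("sysC", 50, 25), ("sysC", 30, 15),
]
toy_r = system_level_correlation(toy_rows)
```

Because the correlation is taken over system averages, a metric can rank systems well even when its segment-level scores are noisy.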
Abstract
Current state-of-the-art Quality Estimation (QE) in machine translation relies on massive, proprietary LLMs, raising data privacy concerns. We demonstrate that smaller, open-source LLMs (<30B parameters) are a viable, cost-effective and privacy-preserving alternative. Using a single-pass prompting strategy, our models simultaneously generate quality scores, MQM error annotations, suggested error corrections, and full post-editions. Our analysis shows these models achieve highly competitive system-level correlations with human judgments that outperform traditional neural metrics, fine-tuned models, and human inter-annotator agreement, effectively approximating the capabilities of much larger proprietary LLMs.
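The single-pass strategy described in the abstract can be pictured as one prompt whose response carries all four outputs at once. The sketch below is only an illustration under stated assumptions: the prompt wording, the JSON schema, and every name in it (`PROMPT_TEMPLATE`, `QEResult`, `ErrorAnnotation`, `build_prompt`, `parse_response`) are hypothetical and not taken from the paper. No model is called; the sample response is hand-written.

```python
import json
from dataclasses import dataclass

# Hypothetical prompt: one request that asks for every output at once.
PROMPT_TEMPLATE = (
    "Assess the translation below.\n"
    "Source: {source}\n"
    "Translation: {translation}\n"
    "Return a JSON object with the keys: score (0-100), "
    "errors (a list of objects with span, category, severity, correction), "
    "and post_edit (the fully corrected translation)."
)


@dataclass
class ErrorAnnotation:
    span: str        # erroneous text in the translation
    category: str    # MQM-style category, e.g. "accuracy/mistranslation"
    severity: str    # e.g. "minor", "major", "critical"
    correction: str  # suggested fix for the span


@dataclass
class QEResult:
    score: float
    errors: list
    post_edit: str


def build_prompt(source, translation):
    """Fill the (assumed) single-pass prompt for one segment."""
    return PROMPT_TEMPLATE.format(source=source, translation=translation)


def parse_response(raw):
    """Parse the model's JSON reply into score, error annotations, and post-edit."""
    data = json.loads(raw)
    errors = [ErrorAnnotation(**item) for item in data["errors"]]
    return QEResult(float(data["score"]), errors, data["post_edit"])


# Hand-written example reply (invented, not model output).
sample_prompt = build_prompt("Der Hund schläft.", "The cat is sleeping.")
sample_reply = json.dumps({
    "score": 62,
    "errors": [{
        "span": "cat",
        "category": "accuracy/mistranslation",
        "severity": "major",
        "correction": "dog",
    }],
    "post_edit": "The dog is sleeping.",
})
sample_result = parse_response(sample_reply)
```

Keeping the score, error spans, corrections, and post-edit in one structured reply is what makes the assessment interpretable: each score comes with the errors that justify it.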
