MTQE.en-he: Machine Translation Quality Estimation for English-Hebrew

Andy Rosenbaum; Assaf Siani; Ilan Kernerman

arXiv:2602.06546 · cs.CL · February 9, 2026

TL;DR

This paper introduces MTQE.en-he, to the authors' knowledge the first publicly available English-Hebrew benchmark for machine translation quality estimation. It evaluates ChatGPT prompting, TransQuest, and CometKiwi, and shows that both ensembling and parameter-efficient fine-tuning improve performance on this under-resourced language pair.

Contribution

The paper provides what the authors describe as the first publicly available English-Hebrew MT quality estimation benchmark, and it systematically evaluates model ensembling and parameter-efficient fine-tuning techniques on it.

Findings

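1. Ensembling ChatGPT, TransQuest, and CometKiwi outperforms the best single model (CometKiwi) by 6.4 percentage points Pearson and 5.6 percentage points Spearman.

2. Parameter-efficient fine-tuning methods (LoRA, BitFit, and FTHead) train stably and improve performance by 2-3 percentage points.

3. Full-model fine-tuning is sensitive to overfitting and distribution collapse.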
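
To make the contrast between full-model and parameter-efficient updates concrete, the sketch below counts how many parameters each scheme would leave trainable on a tiny toy model. Everything in it is an illustrative assumption rather than the paper's setup: the parameter names, the shapes, the `head.` prefix, the LoRA rank, and the choice to keep the head trainable under BitFit and LoRA are all hypothetical. The only general fact it relies on is that a rank-r LoRA adapter on an (out x in) weight adds r * (out + in) trainable values.

```python
# Hypothetical toy model: names and shapes are illustrative, not from the paper.
TOY_SHAPES = {
    "encoder.layer0.attn.weight": (8, 8),
    "encoder.layer0.attn.bias": (8,),
    "encoder.layer0.ffn.weight": (16, 8),
    "encoder.layer0.ffn.bias": (16,),
    "head.weight": (1, 8),
    "head.bias": (1,),
}


def numel(shape):
    """Number of scalar values in a tensor of the given shape."""
    n = 1
    for dim in shape:
        n *= dim
    return n


def trainable_count(shapes, method, rank=2, head_prefix="head."):
    """Count trainable scalars under a given fine-tuning scheme.

    Assumptions (not confirmed by the source): the regression head stays
    trainable under BitFit and LoRA, and LoRA is applied to every 2-D
    encoder weight.
    """
    if method == "full":
        # Every parameter is updated.
        return sum(numel(s) for s in shapes.values())
    if method == "fthead":
        # Only the head is updated; the encoder is frozen.
        return sum(numel(s) for n, s in shapes.items() if n.startswith(head_prefix))
    if method == "bitfit":
        # Bias terms (plus the head, by assumption) are updated.
        return sum(
            numel(s)
            for n, s in shapes.items()
            if n.endswith(".bias") or n.startswith(head_prefix)
        )
    if method == "lora":
        total = 0
        for name, shape in shapes.items():
            if name.startswith(head_prefix):
                total += numel(shape)
            elif len(shape) == 2:
                # Frozen W (out x in) gets adapters B (out x r) and A (r x in).
                total += rank * (shape[0] + shape[1])
        return total
    raise ValueError(f"unknown method: {method}")


if __name__ == "__main__":
    for method in ("full", "lora", "bitfit", "fthead"):
        print(method, trainable_count(TOY_SHAPES, method))
```

On a model this small the counts are close together; on a real encoder with large weight matrices, the parameter-efficient schemes would update a far smaller fraction of the model than a full update does.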

Abstract

We release MTQE.en-he: to our knowledge, the first publicly available English-Hebrew benchmark for Machine Translation Quality Estimation. MTQE.en-he contains 959 English segments from WMT24++, each paired with a machine translation into Hebrew, and Direct Assessment scores of the translation quality annotated by three human experts. We benchmark ChatGPT prompting, TransQuest, and CometKiwi and show that ensembling the three models outperforms the best single model (CometKiwi) by 6.4 percentage points Pearson and 5.6 percentage points Spearman. Fine-tuning experiments with TransQuest and CometKiwi reveal that full-model updates are sensitive to overfitting and distribution collapse, yet parameter-efficient methods (LoRA, BitFit, and FTHead, i.e., fine-tuning only the classification head) train stably and yield improvements of 2-3 percentage points. MTQE.en-he and our experimental…
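
The abstract reports ensemble gains in Pearson and Spearman correlation against human Direct Assessment scores. As a minimal sketch of how such a comparison could be computed, the code below z-normalizes each model's scores, averages them, and correlates the result with human scores. The excerpt does not say how the three models are actually combined, so the unweighted z-score average is an assumption, and all numbers in the demo are made up for illustration.

```python
from math import sqrt


def zscore(xs):
    """Standardize scores to mean 0 and unit variance (assumes they are not constant)."""
    mean = sum(xs) / len(xs)
    sd = sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    return [(x - mean) / sd for x in xs]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranks(xs):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


def ensemble(*model_scores):
    """Unweighted mean of z-normalized scores (an assumed combination rule)."""
    normalized = [zscore(scores) for scores in model_scores]
    return [sum(column) / len(column) for column in zip(*normalized)]


if __name__ == "__main__":
    # Made-up scores for five segments; not data from the benchmark.
    human = [85.0, 40.0, 70.0, 95.0, 20.0]
    model_a = [0.80, 0.55, 0.60, 0.90, 0.30]
    model_b = [78.0, 35.0, 80.0, 88.0, 41.0]
    model_c = [0.70, 0.20, 0.65, 0.60, 0.10]
    combined = ensemble(model_a, model_b, model_c)
    print("Pearson:", pearson(combined, human))
    print("Spearman:", spearman(combined, human))
```

Normalizing before averaging matters because the models may score on different scales (for example 0-1 versus 0-100); without it, the widest-ranged model would dominate the average.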

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Data Classification