Preliminary Ranking of WMT25 General Machine Translation Systems
Tom Kocmi, Eleftherios Avramidis, Rachel Bawden, Ondřej Bojar, Konstantin Dranch, Anton Dvorkovich, Sergey Dukanov, Natalia Fedorova, Mark Fishel, Markus Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow, Marzena Karpinska, Philipp Koehn, Howard Lakougna

TL;DR
This paper provides preliminary automatic rankings of WMT25 machine translation systems, highlighting potential biases and emphasizing that human evaluation will ultimately determine the official rankings.
Contribution
It offers early automatic evaluation results for WMT25 MT systems, aiding participants before official human-based rankings are released.
Findings
Preliminary rankings based on automatic metrics.
Automatic evaluation may favor systems that use re-ranking techniques such as Quality Estimation or Minimum Bayes Risk decoding.
Official rankings will rely on human evaluation.
Abstract
We present the preliminary rankings of machine translation (MT) systems submitted to the WMT25 General Machine Translation Shared Task, as determined by automatic evaluation metrics. Because these rankings are derived from automatic evaluation, they may exhibit a bias toward systems that employ re-ranking techniques, such as Quality Estimation or Minimum Bayes Risk decoding. The official WMT25 ranking will be based on human evaluation, which is more reliable and will supersede these results. The purpose of releasing these findings now is to assist task participants with their system description papers, not to provide final findings.
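To illustrate why re-ranking can interact with automatic evaluation, the following is a minimal sketch of Minimum Bayes Risk selection. It is not taken from the paper or from any submitted system: the function names `overlap_f1` and `mbr_select` are hypothetical, and the unigram-overlap utility is a toy stand-in for the learned metrics that real systems would use. The point it shows is that MBR picks the candidate that scores best under a chosen utility, so a system that re-ranks with a metric similar to the one used for evaluation may be favored by that evaluation.

```python
from collections import Counter


def overlap_f1(hyp: str, ref: str) -> float:
    """Toy utility (assumption): unigram F1 between two whitespace-tokenized strings."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    common = sum((h & r).values())  # clipped count of shared tokens
    if common == 0:
        return 0.0
    precision = common / sum(h.values())
    recall = common / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_select(candidates, utility=overlap_f1):
    """Return the candidate with the highest average utility against the other candidates."""
    if not candidates:
        raise ValueError("candidates must not be empty")
    if len(candidates) == 1:
        return candidates[0]

    def expected_utility(i):
        # The other candidates act as pseudo-references for candidate i.
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(utility(candidates[i], o) for o in others) / len(others)

    best = max(range(len(candidates)), key=expected_utility)
    return candidates[best]


# Hypothetical candidate translations sampled from one system.
candidates = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a dog stood near the door",
]
selected = mbr_select(candidates)
print(selected)
```

In this toy run the second candidate is selected because it overlaps most with the other two on average. Quality Estimation re-ranking follows the same pattern, except that each candidate is scored by a reference-free metric rather than against the other candidates.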
Taxonomy
Topics: Natural Language Processing Techniques · Text Readability and Simplification · Speech and dialogue systems
