Open Machine Translation for Esperanto

Ona de Gibert; Lluís de Gibert

arXiv:2603.29345 · cs.CL · April 1, 2026

TL;DR

This paper evaluates open-source machine translation systems for Esperanto, comparing rule-based systems, encoder-decoder models, and LLMs using both automatic metrics and human evaluation.

Contribution

The paper presents the first comprehensive evaluation of open-source MT systems for Esperanto, covering rule-based, encoder-decoder, and LLM approaches, with publicly released code and models.

Findings

01

NLLB models outperform others across all language pairs.

02

The authors' trained compact models and a fine-tuned general-purpose LLM follow NLLB closely.

03

Human evaluation favors NLLB translations in about half of the cases.

Abstract

Esperanto is a widespread constructed language, known for its regular grammar and productive word formation. Despite having substantial resources available thanks to its online community, it remains relatively underexplored in the context of modern machine translation (MT) approaches. In this work, we present the first comprehensive evaluation of open-source MT systems for Esperanto, comparing rule-based systems, encoder-decoder models, and LLMs across model sizes. We evaluate translation quality across six language directions involving English, Spanish, Catalan, and Esperanto using multiple automatic metrics as well as human evaluation. Our results show that the NLLB family achieves the best performance in all language pairs, followed closely by our trained compact models and a fine-tuned general-purpose LLM. Human evaluation confirms this trend, with NLLB translations preferred in…
