MT-LENS: An all-in-one Toolkit for Better Machine Translation Evaluation

Javier García Gilabert; Carlos Escolano; Audrey Mash; Xixian Liao; Maite Melero

arXiv:2412.11615·cs.CL·December 17, 2024

TL;DR

MT-LENS is a comprehensive toolkit that enhances the evaluation of machine translation systems by covering quality, bias, toxicity, and robustness, with interactive visualization and support for diverse datasets.

Contribution

MT-LENS extends an existing evaluation framework, LM-eval-harness, to cover multiple aspects of MT performance, offering a unified, user-friendly platform for thorough assessment.

Findings

01 Supports diverse evaluation metrics and datasets
02 Enables analysis of biases and robustness
03 Provides interactive visualization tools

Abstract

We introduce MT-LENS, a framework designed to evaluate Machine Translation (MT) systems across a variety of tasks, including translation quality, gender bias detection, added toxicity, and robustness to misspellings. While several toolkits have become very popular for benchmarking the capabilities of Large Language Models (LLMs), existing evaluation tools often lack the ability to thoroughly assess the diverse aspects of MT performance. MT-LENS addresses these limitations by extending the capabilities of LM-eval-harness for MT, supporting state-of-the-art datasets and a wide range of evaluation metrics. It also offers a user-friendly platform to compare systems and analyze translations with interactive visualizations. MT-LENS aims to broaden access to evaluation strategies that go beyond traditional translation quality evaluation, enabling researchers and engineers to better understand…

Code & Models

Repositories

langtech-bsc/mt-evaluation (official)

Taxonomy

Topics: Natural Language Processing Techniques