Lost in the Source Language: How Large Language Models Evaluate the Quality of Machine Translation

Xu Huang; Zhirui Zhang; Xiang Geng; Yichao Du; Jiajun Chen; Shujian Huang

arXiv:2401.06568 · cs.CL · June 7, 2024 · 1 citation

TL;DR

This paper examines how Large Language Models evaluate machine translation quality. Reference translations improve evaluation accuracy, whereas source text is sometimes counterproductive, which suggests that LLMs do not fully use their cross-lingual capability when evaluating translations.

Contribution

It provides a detailed analysis of LLMs' evaluation mechanisms, showing their limited use of source information and suggesting directions for enhancing cross-lingual evaluation capabilities.

Findings

01 Reference information improves evaluation accuracy

02 Source information can be counterproductive

03 LLMs' cross-lingual capabilities are underutilized

Abstract

This study investigates how Large Language Models (LLMs) leverage source and reference data in the machine translation evaluation task, aiming to better understand the mechanisms behind their remarkable performance in this task. We design controlled experiments across various input modes and model types, and employ both coarse-grained and fine-grained prompts to discern the utility of source versus reference information. We find that reference information significantly enhances evaluation accuracy, while, surprisingly, source information is sometimes counterproductive, indicating LLMs' inability to fully leverage their cross-lingual capability when evaluating translations. Further analysis of the fine-grained evaluation and fine-tuning experiments shows similar results. These findings also suggest a potential research direction for LLMs that fully exploits the cross-lingual capability…
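To make the idea of "input modes" concrete, here is a minimal Python sketch of how one translation might be presented to an LLM evaluator with or without the source and reference. The function names, the mode labels, the prompt wording, and the 0 to 100 scale are all illustrative assumptions; they are not taken from the paper or its repository.

```python
from typing import Dict, Optional


def build_eval_prompt(translation: str,
                      source: Optional[str] = None,
                      reference: Optional[str] = None) -> str:
    """Assemble a coarse-grained scoring prompt (hypothetical wording).

    Only the fields that are supplied appear in the prompt, so the same
    function covers every input mode.
    """
    parts = ["Score the following translation on a scale from 0 to 100."]
    if source is not None:
        parts.append(f"Source: {source}")
    if reference is not None:
        parts.append(f"Reference: {reference}")
    parts.append(f"Translation: {translation}")
    parts.append("Score:")
    return "\n".join(parts)


def prompts_by_input_mode(translation: str,
                          source: str,
                          reference: str) -> Dict[str, str]:
    """Build one prompt per input mode (mode labels are assumptions)."""
    return {
        "translation_only": build_eval_prompt(translation),
        "source": build_eval_prompt(translation, source=source),
        "reference": build_eval_prompt(translation, reference=reference),
        "source_and_reference": build_eval_prompt(
            translation, source=source, reference=reference
        ),
    }
```

Sending each of the four prompts to the same model and comparing the resulting scores against human judgments is one way to isolate how much the source or the reference contributes, which is the kind of controlled comparison the abstract describes.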

Code & Models

Repositories

xuuhuang/lost_in_the_src · PyTorch · Official

Videos

Lost in the Source Language: How Large Language Models Evaluate the Quality of Machine Translation

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification