When LLMs Struggle: Reference-less Translation Evaluation for Low-resource Languages

Archchana Sindhujan; Diptesh Kanojia; Constantin Orasan; Shenbin Qian

arXiv:2501.04473·cs.CL·January 9, 2025

Open Access

TL;DR

This paper explores reference-less machine translation quality estimation for low-resource languages, comparing large language models and fine-tuned models, and highlights the need for better cross-lingual pre-training.

Contribution

It introduces a novel prompt-based approach and provides a comprehensive evaluation of LLMs versus fine-tuned models for low-resource language QE.

Findings

01

Fine-tuned QE models outperform prompt-based LLM approaches.

02

Tokenization, transliteration, and named entity errors are major challenges.

03

Public release of data and models supports further research.

Abstract

This paper investigates the reference-less evaluation of machine translation for low-resource language pairs, known as quality estimation (QE). Segment-level QE is a challenging cross-lingual language understanding task that assigns a quality score (0-100) to the translated output. We comprehensively evaluate large language models (LLMs) in zero/few-shot scenarios and perform instruction fine-tuning using a novel prompt based on annotation guidelines. Our results indicate that prompt-based approaches are outperformed by the encoder-based fine-tuned QE models. Our error analysis reveals tokenization issues, along with errors due to transliteration and named entities, and argues for refinement in LLM pre-training for cross-lingual tasks. We publicly release the data and trained models for further research.
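To make the zero/few-shot setup concrete, here is a minimal sketch of how a segment-level QE prompt could be assembled and how a 0-100 score could be read back from a model reply. The instruction wording, the `build_prompt` and `parse_score` helpers, and the `Source:/Translation:/Score:` layout are all assumptions made for illustration; they are not the prompt or code used in the paper, and no model call is included.

```python
import re
from typing import Optional, Sequence, Tuple

# Hypothetical instruction text; the paper's actual prompt is not reproduced here.
INSTRUCTION = (
    "Rate the quality of the translation from {src_lang} to {tgt_lang} "
    "on a scale from 0 (worst) to 100 (perfect). Reply with the score only."
)

# One few-shot demonstration: (source, translation, human score).
Example = Tuple[str, str, int]


def build_prompt(
    source: str,
    translation: str,
    src_lang: str,
    tgt_lang: str,
    examples: Sequence[Example] = (),
) -> str:
    """Build a zero-shot prompt (no examples) or a few-shot prompt (with examples)."""
    parts = [INSTRUCTION.format(src_lang=src_lang, tgt_lang=tgt_lang)]
    for ex_src, ex_tgt, ex_score in examples:
        parts.append(f"Source: {ex_src}\nTranslation: {ex_tgt}\nScore: {ex_score}")
    # The segment to be scored goes last, with the score left open.
    parts.append(f"Source: {source}\nTranslation: {translation}\nScore:")
    return "\n\n".join(parts)


_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_score(response: str) -> Optional[float]:
    """Return the first number in [0, 100] found in the reply, or None."""
    for match in _NUMBER.finditer(response):
        value = float(match.group())
        if 0.0 <= value <= 100.0:
            return value
    return None
```

Passing an empty `examples` sequence gives the zero-shot case; passing a few scored pairs gives the few-shot case. Returning `None` for replies without a usable number keeps malformed outputs visible instead of silently turning them into a score.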

Taxonomy

Topics: Natural Language Processing Techniques