Are Large Language Models State-of-the-art Quality Estimators for Machine Translation of User-generated Content?

Shenbin Qian; Constantin Orăsan; Diptesh Kanojia; Félix do Carmo

arXiv:2410.06338·cs.CL·October 10, 2024

TL;DR

This study evaluates large language models (LLMs) as quality estimators for machine translation of user-generated content (UGC). It finds that parameter-efficient fine-tuning (PEFT) improves score prediction, although refusals to reply and unstable outputs remain.

Contribution

It demonstrates that parameter-efficient fine-tuning enhances LLMs' ability to estimate translation quality with human-interpretable explanations for UGC.

Findings

1. PEFT improves score prediction accuracy
2. LLMs provide human-interpretable explanations
3. Challenges include refusal to reply and output instability

Abstract

This paper investigates whether large language models (LLMs) are state-of-the-art quality estimators for machine translation of user-generated content (UGC) that contains emotional expressions, without the use of reference translations. To achieve this, we employ an existing emotion-related dataset with human-annotated errors and calculate quality evaluation scores based on the Multi-dimensional Quality Metrics. We compare the accuracy of several LLMs with that of our fine-tuned baseline models, under in-context learning and parameter-efficient fine-tuning (PEFT) scenarios. We find that PEFT of LLMs leads to better performance in score prediction, with human-interpretable explanations, than our fine-tuned baseline models. However, a manual analysis of LLM outputs reveals that they still have problems, such as refusal to reply to a prompt and unstable output, when evaluating machine translation of UGC.
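As a rough illustration of how a quality score can be derived from human-annotated errors in the spirit of the Multi-dimensional Quality Metrics, the sketch below sums a severity weight for each annotated error. The weights (minor = 1, major = 5, critical = 10), the per-100-words normalisation, and the function names `mqm_penalty` and `mqm_score` are assumptions made for this example; the abstract does not state the exact formula the authors used.

```python
from collections import Counter

# Assumed severity weights (hypothetical; not confirmed by the abstract).
SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}


def mqm_penalty(error_severities, weights=SEVERITY_WEIGHTS):
    """Sum the severity weights of all annotated errors in one translation."""
    counts = Counter(error_severities)
    unknown = set(counts) - set(weights)
    if unknown:
        raise ValueError(f"unknown severity labels: {sorted(unknown)}")
    return sum(weights[severity] * n for severity, n in counts.items())


def mqm_score(error_severities, num_words, weights=SEVERITY_WEIGHTS):
    """Penalty per 100 source words, negated so that higher means better.

    The length normalisation is an assumption of this sketch.
    """
    if num_words <= 0:
        raise ValueError("num_words must be positive")
    return -100.0 * mqm_penalty(error_severities, weights) / num_words


if __name__ == "__main__":
    # One minor and two major errors in a 25-word segment.
    annotated = ["minor", "major", "major"]
    print(mqm_penalty(annotated))
    print(mqm_score(annotated, num_words=25))
```

A reference-free quality estimator, whether a fine-tuned baseline or an LLM, would then be trained or prompted to predict such a score from the source text and its machine translation alone.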

Code & Models

Repositories

surrey-nlp/LLMs4MTQE-UGC (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies