Automated evaluation of LLMs for effective machine translation of Mandarin Chinese to English
Yue Zhang, Rodney Beard, John Hawkins, Rohitash Chandra

TL;DR
This paper presents an automated framework using semantic and sentiment analysis to evaluate the translation quality of LLMs for Mandarin Chinese to English, highlighting strengths and challenges across different text genres.
Contribution
The paper introduces a novel automated evaluation method that combines similarity metrics with expert comparison to assess LLM translation quality across diverse Chinese texts.
Findings
LLMs perform well in news translation
Divergence observed in literary text translation
DeepSeek excels in cultural and grammatical subtleties
Abstract
Although Large Language Models (LLMs) perform exceptionally well in machine translation, their translation quality has received only limited systematic assessment. The challenge lies in building automated frameworks: human-expert-based evaluation is time-consuming, LLMs evolve quickly, and a fair assessment of translation quality needs a diverse set of texts. In this paper, we utilise an automated machine learning framework featuring semantic and sentiment analysis to assess Mandarin Chinese to English translation using Google Translate and LLMs, including GPT-4, GPT-4o, and DeepSeek. We compare original and translated texts in various classes of high-profile Chinese texts, which include novel texts that span modern and classical literature, as well as news articles. As the main evaluation measures, we utilise novel similarity metrics to compare the quality of…
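The abstract is cut off before it names the similarity metrics, so the following is only a minimal sketch of the general idea of scoring candidate translations by similarity to a reference text. It assumes a plain bag-of-words cosine similarity as a stand-in and is not the authors' metric. The function names, the system labels, and the use of a single English reference translation are all hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_translations(reference, candidates):
    """Score each candidate against the reference and sort best-first.

    `reference` is an assumed English reference translation;
    `candidates` maps a (hypothetical) system name to its output text.
    """
    ref_vec = bag_of_words(reference)
    scores = {
        name: cosine_similarity(ref_vec, bag_of_words(text))
        for name, text in candidates.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `rank_translations(reference, {"system_a": "...", "system_b": "..."})` returns the systems ordered by similarity to the reference. A word-count representation ignores word order and meaning, so a framework built on semantic analysis would more plausibly compare sentence embeddings; the ranking logic would stay the same.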
Taxonomy
Topics: Natural Language Processing Techniques · Computational and Text Analysis Methods · Sentiment Analysis and Opinion Mining
