CATER: Leveraging LLM to Pioneer a Multidimensional, Reference-Independent Paradigm in Translation Quality Evaluation
Kurando IIDA, Kenjiro MIMURA

TL;DR
CATER is a prompt-driven, reference-independent framework that uses LLMs to evaluate translation quality along multiple dimensions. It captures subtle errors such as hallucinations and omissions and can be applied immediately across languages and genres.
Contribution
The paper presents CATER, an LLM-based evaluation framework that requires no reference translations, allowing flexible, immediate, and comprehensive assessment of translation quality.
Findings
Supports multilingual and genre-diverse evaluations
Captures subtle errors like hallucinations and omissions
Enables instant, customizable scoring without pre-existing references
Abstract
This paper introduces the Comprehensive AI-assisted Translation Edit Ratio (CATER), a novel and fully prompt-driven framework for evaluating machine translation (MT) quality. Leveraging large language models (LLMs) via a carefully designed prompt-based protocol, CATER expands beyond traditional reference-bound metrics, offering a multidimensional, reference-independent evaluation that addresses linguistic accuracy, semantic fidelity, contextual coherence, stylistic appropriateness, and information completeness. CATER's unique advantage lies in its immediate implementability: by providing the source and target texts along with a standardized prompt, an LLM can rapidly identify errors, quantify edit effort, and produce category-level and overall scores. This approach eliminates the need for pre-computed references or domain-specific resources, enabling instant adaptation to diverse…
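As a rough illustration of the protocol the abstract describes (source and target texts plus a standardized prompt, yielding category-level and overall scores), the following is a minimal sketch. The prompt wording, the function names `build_prompt` and `score_from_edits`, and the scoring formula (100 times one minus edits per target word, averaged across dimensions) are all assumptions made for illustration; the paper's actual prompt and scoring rules are not reproduced on this page. The LLM call itself is omitted, so the edit counts are supplied by hand.

```python
from typing import Dict

# The five dimensions named in the abstract.
DIMENSIONS = (
    "linguistic accuracy",
    "semantic fidelity",
    "contextual coherence",
    "stylistic appropriateness",
    "information completeness",
)


def build_prompt(source: str, target: str) -> str:
    """Assemble a reference-free evaluation prompt (hypothetical wording)."""
    dims = "\n".join(f"- {d}" for d in DIMENSIONS)
    return (
        "Evaluate the translation below without using a reference translation.\n"
        "For each dimension, list the errors and count the word-level edits "
        "needed to fix them.\n"
        f"Dimensions:\n{dims}\n\n"
        f"Source:\n{source}\n\n"
        f"Translation:\n{target}\n"
    )


def score_from_edits(edits: Dict[str, int], target_words: int) -> Dict[str, float]:
    """Turn per-dimension edit counts into 0-100 scores (assumed formula)."""
    if target_words <= 0:
        raise ValueError("target_words must be positive")
    scores: Dict[str, float] = {}
    for dim in DIMENSIONS:
        # Edit ratio: edits per target word, capped at 1.0.
        ratio = min(edits.get(dim, 0) / target_words, 1.0)
        scores[dim] = round(100.0 * (1.0 - ratio), 2)
    # Overall score: unweighted mean of the dimension scores (an assumption).
    scores["overall"] = round(sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS), 2)
    return scores
```

In use, the string from `build_prompt` would be sent to an LLM of choice, and the edit counts parsed from its reply would be passed to `score_from_edits` together with the word count of the translation.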
Taxonomy
Topics: Natural Language Processing Techniques
