Towards Fine-Grained Information: Identifying the Type and Location of Translation Errors

Keqin Bao; Yu Wan; Dayiheng Liu; Baosong Yang; Wenqiang Lei; Xiangnan He; Derek F. Wong; Jun Xie

arXiv:2302.08975 · cs.CL · February 20, 2023 · 1 cites

Open Access

TL;DR

This paper introduces FG-TED, a task and accompanying model for detecting the type and location of translation errors jointly, giving more accurate and reliable predictions, especially in low-resource scenarios.

Contribution

The paper proposes the FG-TED task and a model that identifies both the type and the position of translation errors, which previous methods could not consider together.

Findings

01 Achieves state-of-the-art results on the benchmark dataset.

02 Provides more reliable predictions in low-resource and transfer scenarios.

03 Constructs synthetic datasets to improve training and reduce labeling disagreements.

Abstract

Fine-grained information on translation errors is helpful for the translation evaluation community. Existing approaches cannot consider error position and type at the same time, and so fail to integrate the two kinds of error information. In this paper, we propose the Fine-Grained Translation Error Detection (FG-TED) task, which aims at identifying both the position and the type of translation errors in given source-hypothesis sentence pairs. We also build an FG-TED model to predict addition and omission errors, two typical translation accuracy errors. First, we use a word-level classification paradigm to form our model and use shortcut learning reduction to relieve the influence of monolingual features. In addition, we construct synthetic datasets for model training and relieve the disagreement of data labeling in authoritative datasets, making the experimental benchmark…
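
To make the word-level classification paradigm concrete, here is a minimal sketch of how per-word labels could be turned into (type, position) predictions. It is an illustration only, not the authors' implementation: the label names `OK`, `ADD`, and `OMIT` and the helper `tags_to_spans` are hypothetical, and the sketch makes no assumption about which side of the source-hypothesis pair each error type is tagged on.

```python
from typing import List, Tuple


def tags_to_spans(tags: List[str], ok_tag: str = "OK") -> List[Tuple[str, int, int]]:
    """Group consecutive identical non-OK word tags into (type, start, end) spans.

    `end` is exclusive. Tag names are hypothetical placeholders.
    """
    spans: List[Tuple[str, int, int]] = []
    start = None  # index where the currently open error span began
    # A trailing sentinel OK tag closes any span that runs to the last word.
    for i, tag in enumerate(tags + [ok_tag]):
        if start is not None and tag != tags[start]:
            spans.append((tags[start], start, i))
            start = None
        if start is None and tag != ok_tag and i < len(tags):
            start = i
    return spans


# Hypothetical word-level predictions for a six-word sentence.
predicted = ["OK", "ADD", "ADD", "OK", "OMIT", "OK"]
print(tags_to_spans(predicted))
```

Each resulting tuple carries both pieces of information the task asks for: the error type and the word positions it covers.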

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification