Repairing Pronouns in Translation with BERT-Based Post-Editing
Reid Pryzant

TL;DR
This paper addresses pronoun translation errors in neural machine translation, demonstrating their impact on quality and proposing a BERT-based post-editing method that improves translation accuracy for pronouns, especially in Japanese-English translation.
Contribution
The paper introduces a BERT-based post-editing approach for correcting pronoun errors in NMT output: BERT is fine-tuned on a pronoun prediction task over chunks of source-side sentences, and the resulting classifier is used to repair the translations of an existing NMT model.
Findings
In some domains, pronoun choice can account for more than half of an NMT system's errors.
Pronouns have a disproportionately large impact on perceived translation quality.
The proposed method improves translation accuracy for pronouns in Japanese-English translations.
Abstract
Pronouns are important determinants of a text's meaning but difficult to translate. This is because pronoun choice can depend on entities described in previous sentences, and in some languages pronouns may be dropped when the referent is inferable from the context. These issues can lead Neural Machine Translation (NMT) systems to make critical errors on pronouns that impair intelligibility and even reinforce gender bias. We investigate the severity of this pronoun issue, showing that (1) in some domains, pronoun choice can account for more than half of an NMT system's errors, and (2) pronouns have a disproportionately large impact on perceived translation quality. We then investigate a possible solution: fine-tuning BERT on a pronoun prediction task using chunks of source-side sentences, then using the resulting classifier to repair the translations of an existing NMT model. We offer an…
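The repair step described in the abstract can be illustrated with a minimal sketch. It assumes a classifier that maps source-side text to one English pronoun; here that classifier is a hypothetical stub (`predict`) standing in for the fine-tuned BERT model, and the pronoun list, the function name `repair_pronoun`, and the "first pronoun only" rule are illustrative assumptions, not details from the paper. Verb agreement after a swap is deliberately ignored.

```python
import re
from typing import Callable

# Hypothetical label set; the paper's actual pronoun classes are not given here.
PRONOUNS = ("i", "you", "he", "she", "it", "we", "they")
PRONOUN_RE = re.compile(r"\b(" + "|".join(PRONOUNS) + r")\b", re.IGNORECASE)


def repair_pronoun(source: str, translation: str,
                   predict: Callable[[str], str]) -> str:
    """Replace the first pronoun in `translation` with the pronoun that
    `predict` (a stand-in for a fine-tuned classifier) assigns to `source`."""
    predicted = predict(source).lower()
    match = PRONOUN_RE.search(translation)
    # Leave the translation alone if it has no pronoun or already agrees.
    if match is None or match.group(0).lower() == predicted:
        return translation
    # "I" is always capitalized; otherwise capitalize only sentence-initially.
    if predicted == "i":
        replacement = "I"
    elif match.start() == 0:
        replacement = predicted.capitalize()
    else:
        replacement = predicted
    return translation[:match.start()] + replacement + translation[match.end():]


if __name__ == "__main__":
    # Stub classifier: always answers "she". A real system would run the
    # fine-tuned model on the source sentence and its context instead.
    stub = lambda src: "she"
    print(repair_pronoun("学校に行きました。", "He went to school.", stub))
```

Keeping the classifier behind a plain callable means the NMT system itself is untouched, which matches the post-editing framing: only the pronoun token in the existing output is rewritten.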
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Repair · Linear Layer · Residual Connection · Weight Decay · WordPiece · Adam · Dense Connections · Softmax · Layer Normalization
