Code-Switching with Word Senses for Pretraining in Neural Machine Translation

Vivek Iyer; Edoardo Barba; Alexandra Birch; Jeff Z. Pan; Roberto Navigli

arXiv:2310.14050·cs.CL·October 24, 2023·1 cites

Open Access

TL;DR

This paper presents WSP-NMT, a novel pretraining method that incorporates word sense information from Knowledge Bases to improve multilingual neural machine translation, addressing lexical ambiguity and enhancing translation quality.

Contribution

The paper introduces an end-to-end pretraining approach that leverages word sense-specific knowledge, significantly improving translation accuracy and robustness in resource-scarce scenarios.

Findings

01. Significant improvements in translation quality.

02. Enhanced robustness across challenging datasets.

03. Better disambiguation accuracy on the DiBiMT benchmark.

Abstract

Lexical ambiguity is a significant and pervasive challenge in Neural Machine Translation (NMT), with many state-of-the-art (SOTA) NMT systems struggling to handle polysemous words (Campolungo et al., 2022). The same holds for the NMT pretraining paradigm of denoising synthetic "code-switched" text (Pan et al., 2021; Iyer et al., 2023), where word senses are ignored in the noising stage, leading to harmful sense biases in the pretraining data that are subsequently inherited by the resulting models. In this work, we introduce Word Sense Pretraining for Neural Machine Translation (WSP-NMT), an end-to-end approach for pretraining multilingual NMT models leveraging word sense-specific information from Knowledge Bases. Our experiments show significant improvements in overall translation quality. Then, we show the robustness of our approach to scale to various challenging data and…
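
To make the contrast in the abstract concrete, the following is a minimal toy sketch of code-switching noise with and without word senses. It is not the authors' implementation: the lexicon entries, the sense labels, and the names SENSE_LEXICON, sense_agnostic_lexicon, and code_switch are all hypothetical, and the per-token sense labels are assumed to come from some word sense disambiguation step that is not shown here.

```python
import random

# Hypothetical toy lexicon: (English lemma, sense label) -> Italian translation.
# A real system would draw such entries from a knowledge base; these are
# illustrative only.
SENSE_LEXICON = {
    ("bank", "finance"): "banca",
    ("bank", "river"): "riva",
    ("spring", "season"): "primavera",
    ("spring", "coil"): "molla",
}


def sense_agnostic_lexicon(sense_lexicon):
    """Collapse senses, keeping the first translation listed for each lemma.

    This mimics a noising stage that ignores word senses.
    """
    flat = {}
    for (lemma, _sense), translation in sense_lexicon.items():
        flat.setdefault(lemma, translation)
    return flat


def code_switch(tokens, senses, replace_prob, rng, use_senses=True):
    """Replace tokens with translations, each with probability replace_prob.

    tokens: list of source-language tokens.
    senses: list of sense labels (or None), one per token; assumed to be
            the output of a separate disambiguation step.
    """
    flat = sense_agnostic_lexicon(SENSE_LEXICON)
    switched = []
    for token, sense in zip(tokens, senses):
        if use_senses:
            translation = SENSE_LEXICON.get((token, sense))
        else:
            translation = flat.get(token)
        if translation is not None and rng.random() < replace_prob:
            switched.append(translation)
        else:
            switched.append(token)
    return switched


tokens = ["she", "sat", "on", "the", "bank", "of", "the", "river"]
senses = [None, None, None, None, "river", None, None, None]

# Sense-aware noising picks the translation matching the labelled sense.
aware = code_switch(tokens, senses, 1.0, random.Random(0))
# Sense-agnostic noising falls back to the first listed translation.
agnostic = code_switch(tokens, senses, 1.0, random.Random(0), use_senses=False)

print(" ".join(aware))     # she sat on the riva of the river
print(" ".join(agnostic))  # she sat on the banca of the river
```

In a denoising setup, a model would be trained to reconstruct the original sentence from the switched one, so the sense-agnostic variant pairs the river sense of "bank" with the financial translation, which is the kind of sense bias the abstract describes.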

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling