Anticipation-Free Training for Simultaneous Machine Translation

Chih-Chiang Chang; Shun-Po Chuang; Hung-yi Lee

arXiv:2201.12868·cs.CL·May 5, 2022

Open Access · 1 Repo

TL;DR

This paper introduces a novel training framework for simultaneous machine translation that separates translation and reordering, using an auxiliary sorting network to improve translation quality with lower latency.

Contribution

It proposes a new end-to-end training approach that decomposes translation and reordering, avoiding external aligners and reducing hallucinations in SimulMT.

Findings

1. Outperforms previous methods in translation quality
2. Achieves lower latency in streaming translation
3. Reduces hallucination phenomena during inference

Abstract

Simultaneous machine translation (SimulMT) speeds up the translation process by starting to translate before the source sentence is completely available. It is difficult due to limited context and word-order differences between languages. Existing methods increase latency or introduce adaptive read-write policies for SimulMT models to handle local reordering and improve translation quality. However, long-distance reordering can cause SimulMT models to learn translation incorrectly. Specifically, the model may be forced to predict target tokens when the corresponding source tokens have not been read. This leads to aggressive anticipation during inference, resulting in the hallucination phenomenon. To mitigate this problem, we propose a new framework that decomposes the translation process into a monotonic translation step and a reordering step, and we model the latter by the…
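The anticipation problem described in the abstract can be illustrated with a small sketch. It is not taken from the paper or its repository: it assumes a fixed wait-k read policy (read k source tokens, then alternate one write and one read) and a hand-written toy alignment, and the names `wait_k_reads` and `anticipated_positions` are hypothetical. It only counts the target positions that would have to be emitted before their aligned source token has been read.

```python
from typing import List


def wait_k_reads(k: int, src_len: int, tgt_len: int) -> List[int]:
    """Source tokens already read when each target token is emitted.

    Assumes a wait-k policy: k tokens are read before the first write,
    then one more source token is read before each later write.
    """
    return [min(k + t, src_len) for t in range(tgt_len)]


def anticipated_positions(alignment: List[int], reads: List[int]) -> List[int]:
    """Target positions whose aligned source index (0-based) is still unread."""
    return [t for t, (s, r) in enumerate(zip(alignment, reads)) if s >= r]


# Hypothetical toy alignments: target token t aligns to source index alignment[t].
monotonic = [0, 1, 2, 3, 4]    # same word order in both languages
verb_final = [0, 4, 1, 2, 3]   # last source token is needed second in the target

reads = wait_k_reads(k=2, src_len=5, tgt_len=5)

print(reads)                                     # [2, 3, 4, 5, 5]
print(anticipated_positions(monotonic, reads))   # []
print(anticipated_positions(verb_final, reads))  # [1]
```

In the reordered case, target position 1 must be produced while its source token is still unseen, which is the situation the abstract says forces the model to guess. A reference whose order follows the source, as in the monotonic case, would not create such positions under the same policy.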

Code & Models

Repositories

george0828zhang/sinkhorn-simultrans
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications