Anticipation-Free Training for Simultaneous Machine Translation
Chih-Chiang Chang, Shun-Po Chuang, Hung-yi Lee

TL;DR
This paper introduces a novel training framework for simultaneous machine translation that separates translation and reordering, using an auxiliary sorting network to improve translation quality with lower latency.
Contribution
It proposes a new end-to-end training approach that decomposes translation and reordering, avoiding external aligners and reducing hallucinations in SimulMT.
Findings
Outperforms previous methods in translation quality
Achieves lower latency in streaming translation
Reduces hallucination phenomena during inference
Abstract
Simultaneous machine translation (SimulMT) speeds up the translation process by starting to translate before the source sentence is completely available. It is difficult because of limited context and word-order differences between languages. Existing methods increase latency or introduce adaptive read-write policies for SimulMT models to handle local reordering and improve translation quality. However, long-distance reordering can cause SimulMT models to learn translation incorrectly. Specifically, the model may be forced to predict target tokens when the corresponding source tokens have not been read. This leads to aggressive anticipation during inference, resulting in hallucination. To mitigate this problem, we propose a new framework that decomposes the translation process into a monotonic translation step and a reordering step, and we model the latter by the…
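The anticipation problem described in the abstract can be illustrated with a small sketch. It is not the paper's method. It assumes a fixed wait-k read-write policy (the model reads k source tokens, then alternates one write and one read) and a hypothetical word alignment given as (source index, target index) pairs. The function names `tokens_read_wait_k` and `anticipated_targets` are made up for this illustration.

```python
from typing import List, Tuple


def tokens_read_wait_k(target_index: int, k: int, source_len: int) -> int:
    """Number of source tokens already read when the target token at
    0-based position `target_index` is emitted under an assumed wait-k policy."""
    return min(k + target_index, source_len)


def anticipated_targets(
    alignment: List[Tuple[int, int]], k: int, source_len: int
) -> List[int]:
    """Return target positions aligned to a source token that has not been
    read yet, i.e. positions where the model would have to anticipate."""
    flagged = set()
    for src, tgt in alignment:
        # Source indices 0 .. read-1 are visible; anything beyond is unread.
        if src >= tokens_read_wait_k(tgt, k, source_len):
            flagged.add(tgt)
    return sorted(flagged)


# Hypothetical 5-token sentence pair where the first target word depends
# on the last source word (long-distance reordering).
reordered = [(4, 0), (0, 1), (1, 2), (2, 3), (3, 4)]
# The same pair with a fully monotonic alignment.
monotonic = [(i, i) for i in range(5)]

print(anticipated_targets(reordered, k=2, source_len=5))  # [0]
print(anticipated_targets(reordered, k=5, source_len=5))  # []
print(anticipated_targets(monotonic, k=1, source_len=5))  # []
```

Under these assumptions, the reordered pair forces anticipation at target position 0 unless k is raised to the full sentence length, which removes the latency benefit. The monotonic pair needs no anticipation even at k=1, which is the intuition behind training the translation step on monotonic targets and handling reordering separately.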
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
