Data-Driven Adaptive Simultaneous Machine Translation

Guangxu Xun; Mingbo Ma; Yuchen Bian; Xingyu Cai; Jiaji Huang; Renjie Zheng; Junkun Chen; Jiahong Yuan; Kenneth Church; Liang Huang

arXiv:2204.12672·cs.CL·April 28, 2022

Open Access

TL;DR

This paper introduces an adaptive training scheme for simultaneous machine translation (SimulMT). Unlike the fixed wait-k policy, it can adjust latency to the context, and its training complexity stays the same as that of full-sentence translation.

Contribution

It proposes a novel training method that augments data with adaptive prefix pairs, enabling more flexible and efficient SimulMT training.

Findings

01. Outperforms strong baselines in translation quality

02. Achieves better latency-quality trade-offs

03. Maintains training complexity similar to full-sentence models

Abstract

In simultaneous translation (SimulMT), the most widely used strategy is the wait-k policy, thanks to its simplicity and effectiveness in balancing translation quality and latency. However, wait-k suffers from two major limitations: (a) it is a fixed policy that cannot adaptively adjust latency given context, and (b) its training is much slower than full-sentence translation. To alleviate these issues, we propose a novel and efficient training scheme for adaptive SimulMT by augmenting the training corpus with adaptive prefix-to-prefix pairs, while the training complexity remains the same as that of training full-sentence translation models. Experiments on two language pairs show that our method outperforms all strong baselines in terms of translation quality and latency.
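For readers unfamiliar with wait-k, the following is a minimal Python sketch of the fixed baseline policy the abstract refers to, not of the adaptive scheme the paper proposes. Under the usual definition of wait-k, the model reads k source tokens before writing the first target token and then alternates one read with one write, so target token t is produced after min(k + t - 1, |x|) source tokens. The function names `wait_k_schedule` and `wait_k_prefix_pairs` are hypothetical and chosen for illustration; the way the paper selects its adaptive prefix pairs is not described on this page and is not reproduced here.

```python
from typing import List, Sequence, Tuple


def wait_k_schedule(src_len: int, tgt_len: int, k: int) -> List[int]:
    """Source tokens read before each target token is written under wait-k.

    Entry t-1 holds g(t) = min(k + t - 1, src_len) for t = 1..tgt_len.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return [min(k + t - 1, src_len) for t in range(1, tgt_len + 1)]


def wait_k_prefix_pairs(
    src: Sequence[str], tgt: Sequence[str], k: int
) -> List[Tuple[List[str], List[str], str]]:
    """Enumerate (source prefix, target history, next target token) triples.

    This illustrates prefix-to-prefix training examples for the FIXED
    wait-k policy only; it is an assumption-laden sketch, not the
    adaptive augmentation proposed in the paper.
    """
    schedule = wait_k_schedule(len(src), len(tgt), k)
    return [
        (list(src[:g]), list(tgt[:t]), tgt[t])
        for t, g in enumerate(schedule)
    ]


if __name__ == "__main__":
    source = ["a", "b", "c", "d", "e"]
    target = ["w", "x", "y", "z"]
    for src_prefix, history, next_token in wait_k_prefix_pairs(source, target, k=2):
        print(src_prefix, history, "->", next_token)
```

The schedule depends only on k and the sentence lengths, which is what the abstract means by a fixed policy: the lag never changes in response to the content being translated.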

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification