Fuzzy Alignments in Directed Acyclic Graph for Non-Autoregressive Machine Translation
Zhengrui Ma, Chenze Shao, Shangtong Gui, Min Zhang, Yang Feng

TL;DR
This paper introduces a fuzzy alignment approach for directed acyclic graph structures in non-autoregressive machine translation, improving handling of multiple translation modalities and achieving state-of-the-art results.
Contribution
It proposes a novel fuzzy alignment training method that relaxes strict token-vertex alignment, enhancing NAT performance on multiple translation modalities.
Findings
Significant improvement in translation quality on WMT benchmarks.
Increased prediction confidence in NAT models.
Sets a new state of the art for NAT on raw training data.
Abstract
Non-autoregressive translation (NAT) reduces decoding latency but suffers from performance degradation due to the multi-modality problem. Recently, the directed acyclic graph structure has achieved great success in NAT, tackling the multi-modality problem by introducing dependency between vertices. However, training it with negative log-likelihood loss implicitly requires a strict alignment between reference tokens and vertices, weakening its ability to handle multiple translation modalities. In this paper, we hold the view that all paths in the graph are fuzzily aligned with the reference sentence. We do not require the exact alignment but train the model to maximize a fuzzy alignment score between the graph and reference, which takes captured translations in all modalities into account. Extensive experiments on major WMT benchmarks show that our method substantially…
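To make the contrast in the abstract concrete, the toy sketch below compares a strict-alignment likelihood with a fuzzy score on a hand-built graph. Everything in it is an assumption for illustration: the graph, the transition probabilities, the one-token-per-vertex simplification, and the unigram-overlap measure are hypothetical and are not taken from the paper, whose actual scoring function is not described in the text above.

```python
from collections import Counter

# Hypothetical toy graph: each vertex emits exactly one token.
# (A simplification for illustration; a real model would predict a
# token distribution at every vertex.)
TOKENS = ["<s>", "thanks", "thank", "you", "</s>"]

# TRANS[i][j] is the assumed probability of moving from vertex i to vertex j (i < j).
TRANS = {
    0: {1: 0.6, 2: 0.4},
    1: {4: 1.0},
    2: {3: 1.0},
    3: {4: 1.0},
}


def enumerate_paths(trans, start, end):
    """Yield (path, probability) for every start-to-end path (brute force)."""
    stack = [([start], 1.0)]
    while stack:
        path, prob = stack.pop()
        last = path[-1]
        if last == end:
            yield path, prob
            continue
        for nxt, p in trans.get(last, {}).items():
            stack.append((path + [nxt], prob * p))


def unigram_overlap(hyp, ref):
    """Clipped unigram matches divided by the longer length (illustrative measure)."""
    matched = sum((Counter(hyp) & Counter(ref)).values())
    return matched / max(len(hyp), len(ref))


def strict_likelihood(ref):
    """Total probability of paths whose tokens equal the reference exactly."""
    return sum(
        prob
        for path, prob in enumerate_paths(TRANS, 0, len(TOKENS) - 1)
        if [TOKENS[v] for v in path] == ref
    )


def fuzzy_score(ref):
    """Expected overlap with the reference, averaged over all paths."""
    return sum(
        prob * unigram_overlap([TOKENS[v] for v in path], ref)
        for path, prob in enumerate_paths(TRANS, 0, len(TOKENS) - 1)
    )


reference = ["<s>", "thank", "you", "</s>"]
print("strict:", strict_likelihood(reference))
print("fuzzy: ", fuzzy_score(reference))
```

In this sketch the strict objective credits only the single path that reproduces the reference token for token, while the fuzzy score also gives partial credit to the other path. Brute-force path enumeration is only workable for a graph this small.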
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Translation Studies and Practices
