Order-Agnostic Cross Entropy for Non-Autoregressive Machine Translation
Cunxiao Du, Zhaopeng Tu, Jing Jiang

TL;DR
This paper introduces order-agnostic cross entropy (OaXE), a novel training objective for non-autoregressive translation models that addresses word reordering issues and improves translation quality.
Contribution
It proposes OaXE, a loss function that removes the penalty for word-order errors and computes the cross entropy under the best alignment between model predictions and target tokens.
Findings
OaXE sets a new state of the art for fully non-autoregressive models on major WMT benchmarks.
It reduces token repetitions and increases prediction confidence.
The method effectively alleviates the multimodality problem.
Abstract
We propose a new training objective named order-agnostic cross entropy (OaXE) for fully non-autoregressive translation (NAT) models. OaXE improves the standard cross-entropy loss to ameliorate the effect of word reordering, which is a common source of the critical multimodality problem in NAT. Concretely, OaXE removes the penalty for word order errors, and computes the cross entropy loss based on the best possible alignment between model predictions and target tokens. Since the log loss is very sensitive to invalid references, we leverage cross entropy initialization and loss truncation to ensure the model focuses on a good part of the search space. Extensive experiments on major WMT benchmarks show that OaXE substantially improves translation performance, setting new state of the art for fully NAT models. Further analyses show that OaXE alleviates the multimodality problem by reducing…
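The core idea in the abstract, scoring the target under its best alignment to the model's predictions rather than in its fixed order, can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the example probabilities, and the brute-force search over permutations are all assumptions made for illustration. Brute force is only feasible for very short targets; a real system would need an efficient assignment solver. Cross entropy initialization and loss truncation are not shown.

```python
import math
from itertools import permutations


def cross_entropy(log_probs, target):
    """Position-wise cross entropy: token target[i] is scored at position i."""
    return -sum(log_probs[i][tok] for i, tok in enumerate(target))


def oaxe_loss(log_probs, target):
    """Toy order-agnostic cross entropy (hypothetical helper).

    Returns the lowest cross entropy over all orderings of the target
    tokens, together with the ordering that achieves it.
    Brute force: cost grows factorially with the target length.
    """
    best_loss, best_order = math.inf, None
    for order in permutations(target):
        loss = cross_entropy(log_probs, order)
        if loss < best_loss:
            best_loss, best_order = loss, order
    return best_loss, list(best_order)


# Made-up per-position distributions over a 3-token vocabulary.
# The model predicts token 1 first and token 0 second, i.e. the
# reference [0, 1, 2] with its first two words swapped.
probs = [
    [0.1, 0.8, 0.1],  # position 0
    [0.7, 0.2, 0.1],  # position 1
    [0.1, 0.1, 0.8],  # position 2
]
log_probs = [[math.log(p) for p in row] for row in probs]
target = [0, 1, 2]

xe = cross_entropy(log_probs, target)      # penalizes the swap
oaxe, order = oaxe_loss(log_probs, target)  # ignores the swap

print(f"XE   = {xe:.3f}")
print(f"OaXE = {oaxe:.3f} with ordering {order}")
```

In this example the standard loss is large because the first two positions disagree with the reference order, while the order-agnostic loss matches each target token to the position that predicts it and is therefore much smaller.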
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
