What Have We Achieved on Non-autoregressive Translation?
Yafu Li, Huajian Zhang, Jianhao Yan, Yongjing Yin, Yue Zhang

TL;DR
This paper systematically evaluates non-autoregressive translation (NAT) methods and finds that, despite recent improvements, they still lag behind autoregressive translation (AT) models on human-aligned metrics. It also highlights the importance of explicitly modeling dependencies.
Contribution
It provides a comprehensive comparison of NAT and AT, emphasizes the significance of dependency modeling, and uses human evaluation for a more accurate assessment.
Findings
NAT methods still underperform AT on more reliable evaluation metrics.
Explicit dependency modeling improves NAT performance and generalization.
Human evaluation reveals gaps not captured by BLEU scores.
Abstract
Recent advances have made non-autoregressive translation (NAT) comparable to autoregressive translation (AT). However, these comparisons rely on BLEU, which has been shown to correlate weakly with human annotations. Limited research compares non-autoregressive translation and autoregressive translation comprehensively, leaving uncertainty about the true proximity of NAT to AT. To address this gap, we systematically evaluate four representative NAT methods across various dimensions, including human evaluation. Our empirical results demonstrate that despite narrowing the performance gap, state-of-the-art NAT still underperforms AT under more reliable evaluation metrics. Furthermore, we discover that explicitly modeling dependencies is crucial for generating natural language and generalizing to out-of-distribution sequences.
Taxonomy
Topics: Natural Language Processing Techniques
