Can Latent Alignments Improve Autoregressive Machine Translation?

Adi Haviv; Lior Vassertail; Omer Levy

arXiv:2104.09554 · cs.CL · April 21, 2021

TL;DR

This paper investigates whether latent alignment objectives such as CTC and AXE, which significantly improve non-autoregressive translation, can also improve autoregressive machine translation models. In practice, training with these objectives yields degenerate models, and the authors prove that latent alignment objectives are incompatible with teacher forcing.

Contribution

The paper demonstrates that latent alignment objectives are incompatible with teacher forcing in autoregressive models and provides a theoretical explanation for the observed degenerate solutions.

Findings

01

Training autoregressive translation models with latent alignment objectives yields degenerate models in practice.

02

The authors prove that latent alignment objectives are incompatible with teacher forcing.

03

Unlike in non-autoregressive translation, where CTC and AXE significantly improve models, the empirical results show no such benefit in the autoregressive setting.

Abstract

Latent alignment objectives such as CTC and AXE significantly improve non-autoregressive machine translation models. Can they improve autoregressive models as well? We explore the possibility of training autoregressive machine translation models with latent alignment objectives, and observe that, in practice, this approach results in degenerate models. We provide a theoretical explanation for these empirical results, and prove that latent alignment objectives are incompatible with teacher forcing.
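
The abstract does not spell out the mechanism, so the following is only an illustrative sketch under our own assumptions, not the paper's proof or its experimental setup. It shows one way a latent alignment objective and teacher forcing might interact badly: with teacher forcing the decoder receives the gold prefix as input, and an objective that sums over alignments (here a small hand-written CTC forward pass) can reward a "model" that merely copies its input, while ordinary position-wise cross-entropy penalizes it. The helper names (`ctc_log_likelihood`, `peaked`), the toy vocabulary, and the choice of giving the decoder one extra position are all hypothetical.

```python
import math

BLANK = "<blank>"
NEG_INF = float("-inf")

def logaddexp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def ctc_log_likelihood(log_probs, target):
    """log P(target) under CTC, summed over all monotonic alignments.

    log_probs: one dict (token -> log-probability) per output position.
    target: non-empty list of tokens.
    """
    # Interleave blanks: blank, t1, blank, t2, ..., blank
    ext = [BLANK]
    for tok in target:
        ext += [tok, BLANK]

    def lp(t, s):
        return log_probs[t].get(ext[s], NEG_INF)

    alpha = [NEG_INF] * len(ext)
    alpha[0] = lp(0, 0)
    alpha[1] = lp(0, 1)
    for t in range(1, len(log_probs)):
        new = [NEG_INF] * len(ext)
        for s in range(len(ext)):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total = logaddexp(total, alpha[s - 1])  # advance by one
            # Skip over a blank, allowed only between two different tokens.
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                total = logaddexp(total, alpha[s - 2])
            new[s] = total + lp(t, s)
        alpha = new
    # A valid alignment ends on the last token or on the trailing blank.
    return logaddexp(alpha[-1], alpha[-2])

def peaked(token, vocab, p=0.99):
    """Toy output distribution: mass p on `token`, the rest spread evenly."""
    rest = (1.0 - p) / (len(vocab) - 1)
    return {v: math.log(p if v == token else rest) for v in vocab}

target = ["the", "cat", "sat"]
vocab = [BLANK] + target

# Teacher forcing: the decoder input is the gold target shifted right.
# Assumption: the decoder gets one position more than the target length.
decoder_inputs = ["<bos>"] + target

# Degenerate "copy" model: it ignores the source sentence entirely,
# emits blank on <bos>, and otherwise repeats the token it was fed.
log_probs = [peaked(BLANK if tok == "<bos>" else tok, vocab)
             for tok in decoder_inputs]

ctc_nll = -ctc_log_likelihood(log_probs, target)
# Position-wise cross-entropy: position i must predict target[i].
xent = -sum(dist[gold] for dist, gold in zip(log_probs, target))

print(f"CTC loss of the copy model:           {ctc_nll:.3f}")
print(f"Cross-entropy loss of the copy model: {xent:.3f}")
```

In this toy setting the copy model gets a near-zero CTC loss even though it never looks at a source sentence, whereas cross-entropy assigns it a large loss. Whether this matches the specific argument in the paper should be checked against the full text.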
