Generative latent neural models for automatic word alignment
Anh Khoa Ngo Ho (LIMSI), François Yvon

TL;DR
This paper explores the use of variational autoencoders for automatic word alignment, showing that extensions of a vanilla model can achieve results comparable to Giza++ and to a strong neural alignment system.
Contribution
It introduces and evaluates several improvements to vanilla variational autoencoders for word alignment, showing their competitiveness with established systems.
Findings
Extended variational autoencoder models achieve competitive alignment results.
Models are competitive with Giza++ and a strong neural alignment system on two language pairs.
Unsupervised latent representations are effective for word alignment tasks.
Abstract
Word alignments identify translational correspondences between words in a parallel sentence pair and are used, for instance, to learn bilingual dictionaries, to train statistical machine translation systems, or to perform quality estimation. Variational autoencoders have recently been used in various areas of natural language processing to learn, in an unsupervised way, latent representations that are useful for language generation tasks. In this paper, we study these models for the task of word alignment, and we propose and assess several evolutions of a vanilla variational autoencoder. We demonstrate that these techniques can yield results competitive with Giza++ and with a strong neural network alignment system for two language pairs.
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Authorship Attribution and Profiling
