Latent Part-of-Speech Sequences for Neural Machine Translation

Xuewen Yang, Yingru Liu, Dongliang Xie, Xin Wang, and Niranjan Balasubramanian

arXiv:1908.11782 · cs.AI · September 2, 2019

TL;DR

This paper introduces LaSyn, a latent variable model for neural machine translation that treats target-side syntax as a latent part-of-speech sequence. LaSyn removes the direct dependence between successive latent variables, so its decoder can search exhaustively over the syntactic choices, which improves translation quality and diversity.

Contribution

LaSyn is a new latent variable model that captures syntax-semantics co-dependence and allows exhaustive search over syntactic choices in NMT.

Findings

01 Improves translation quality across four MT tasks.

02 Enhances diversity in generated translations.

03 Keeps decoding time proportional to the size of the latent variable vocabulary.

Abstract

Learning target-side syntactic structure has been shown to improve Neural Machine Translation (NMT). However, incorporating syntax through latent variables introduces additional complexity in inference, as the models need to marginalize over the latent syntactic structures. To avoid this, models often resort to greedy search, which only allows them to explore a limited portion of the latent space. In this work, we introduce a new latent variable model, LaSyn, that captures the co-dependence between syntax and semantics, while allowing for effective and efficient inference over the latent space. LaSyn decouples direct dependence between successive latent variables, which allows its decoder to exhaustively search through the latent syntactic choices, while keeping decoding speed proportional to the size of the latent variable vocabulary. We implement LaSyn by modifying a transformer-based…
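
The contrast between greedy and exhaustive search over latent tags can be illustrated with a toy, single-step sketch. This is not the LaSyn architecture: the probability tables, the function names `greedy_pair` and `exhaustive_pair`, and the two-tag, three-word vocabulary are all assumptions made up for illustration. The sketch only shows that, when the tag at one step does not depend directly on the previous tag, scoring every tag costs one pass per tag (linear in the latent vocabulary size) and can find a better (tag, word) pair than committing to the most likely tag first.

```python
import numpy as np

def greedy_pair(p_tag, p_word_given_tag):
    """Pick the most likely tag first, then the best word under that tag.

    p_tag:            shape (Z,)   hypothetical p(z_t | context)
    p_word_given_tag: shape (Z, V) hypothetical p(y_t | z_t, context)
    """
    z = int(np.argmax(p_tag))
    y = int(np.argmax(p_word_given_tag[z]))
    return z, y, float(p_tag[z] * p_word_given_tag[z, y])

def exhaustive_pair(p_tag, p_word_given_tag):
    """Score every (tag, word) pair and return the best one.

    The joint table has Z rows, so the work grows linearly with the
    number of latent tags Z.
    """
    joint = p_tag[:, None] * p_word_given_tag  # shape (Z, V)
    z, y = np.unravel_index(np.argmax(joint), joint.shape)
    return int(z), int(y), float(joint[z, y])

# Toy distributions (assumed values, chosen so the two strategies disagree).
p_tag = np.array([0.6, 0.4])
p_word_given_tag = np.array([
    [0.4, 0.3, 0.3],   # tag 0 spreads its mass over several words
    [0.1, 0.1, 0.8],   # tag 1 is less likely but very sure of word 2
])

print(greedy_pair(p_tag, p_word_given_tag))
print(exhaustive_pair(p_tag, p_word_given_tag))
```

In this toy case the greedy strategy commits to tag 0 and reaches a joint probability of 0.24, while the exhaustive strategy finds tag 1 with word 2 at 0.32.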

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
