Towards Neural Machine Translation with Latent Tree Attention

James Bradbury; Richard Socher

arXiv:1709.01915·cs.CL·September 7, 2017

TL;DR

This paper presents a neural machine translation model that learns hierarchical structure on both the source and target without supervision, using policy gradient reinforcement learning. Trained on character-level data with no segmentation or parse annotation, it performs close to an attentional baseline.

Contribution

It pairs a recurrent neural network grammar (RNNG) encoder with a novel attentional RNNG decoder, and induces tree structures on both the source and target without parse supervision.

Findings

01

Learns plausible segmentation and shallow parse structures

02

Achieves performance close to an attentional baseline

03

Trains on character-level datasets with no explicit segmentation or parse annotation

Abstract

Building models that take advantage of the hierarchical structure of language without a priori annotation is a longstanding goal in natural language processing. We introduce such a model for the task of machine translation, pairing a recurrent neural network grammar encoder with a novel attentional RNNG decoder and applying policy gradient reinforcement learning to induce unsupervised tree structures on both the source and target. When trained on character-level datasets with no explicit segmentation or parse annotation, the model learns a plausible segmentation and shallow parse, obtaining performance close to an attentional baseline.
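
As a rough illustration of the idea in the abstract, the sketch below treats a tree over characters as a sequence of discrete shift/reduce actions and updates a sampling policy with a REINFORCE-style policy gradient step. It is a toy under stated assumptions, not the paper's model: the single scalar parameter `theta`, the function names (`build_tree`, `sample_actions`, `reinforce_step`), and the caller-supplied `reward` are all hypothetical, and in the paper the action probabilities come from the neural encoder and decoder rather than one shared parameter.

```python
import math
import random

SHIFT, REDUCE = 0, 1


def build_tree(tokens, actions):
    """Apply shift/reduce actions to tokens and return the binary tree they define."""
    stack, buffer = [], list(tokens)
    for action in actions:
        if action == SHIFT:
            # Move the next token from the buffer onto the stack.
            stack.append(buffer.pop(0))
        else:
            # Merge the top two stack items into one constituent.
            right = stack.pop()
            left = stack.pop()
            stack.append((left, right))
    if buffer or len(stack) != 1:
        raise ValueError("actions do not form a single complete tree")
    return stack[0]


def sample_actions(n_tokens, theta, rng):
    """Sample a valid action sequence from a toy one-parameter policy.

    Returns the actions and d/d(theta) of the log-probability of the
    sampled sequence. Only steps where both actions are legal are random.
    """
    p_reduce = 1.0 / (1.0 + math.exp(-theta))  # hypothetical toy policy
    actions, grad_log_prob = [], 0.0
    stack_size, remaining = 0, n_tokens
    while remaining > 0 or stack_size > 1:
        can_shift, can_reduce = remaining > 0, stack_size >= 2
        if can_shift and can_reduce:
            action = REDUCE if rng.random() < p_reduce else SHIFT
            # Gradient of log sigmoid(theta) or log(1 - sigmoid(theta)).
            grad_log_prob += (1.0 - p_reduce) if action == REDUCE else -p_reduce
        else:
            # Forced move: it contributes no gradient.
            action = SHIFT if can_shift else REDUCE
        if action == SHIFT:
            stack_size, remaining = stack_size + 1, remaining - 1
        else:
            stack_size -= 1
        actions.append(action)
    return actions, grad_log_prob


def reinforce_step(theta, grad_log_prob, reward, baseline=0.0, lr=0.1):
    """One REINFORCE update: move theta along (reward - baseline) * grad log p."""
    return theta + lr * (reward - baseline) * grad_log_prob


if __name__ == "__main__":
    rng = random.Random(0)
    actions, grad = sample_actions(len("hello"), theta=0.0, rng=rng)
    print(build_tree("hello", actions))
    # The reward here is a placeholder value; a translation model would
    # derive it from how well the sampled structure supports the output.
    print(reinforce_step(0.0, grad, reward=1.0))
```

A sequence of n tokens always yields n shifts and n - 1 reduces, so every sampled sequence defines a complete binary tree; subtracting a baseline from the reward is the usual way to reduce the variance of this kind of estimator.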
