Towards Neural Machine Translation with Latent Tree Attention
James Bradbury, Richard Socher

TL;DR
This paper presents a neural machine translation model that learns hierarchical language structures unsupervisedly using reinforcement learning, achieving competitive performance without explicit parse annotations.
Contribution
It introduces a novel model combining a recurrent neural network grammar encoder with an attentional RNNG decoder, inducing tree structures without supervision.
Findings
Learns plausible segmentation and shallow parse structures
Achieves performance close to baseline models
Demonstrates effectiveness on character-level datasets
Abstract
Building models that take advantage of the hierarchical structure of language without a priori annotation is a longstanding goal in natural language processing. We introduce such a model for the task of machine translation, pairing a recurrent neural network grammar encoder with a novel attentional RNNG decoder and applying policy gradient reinforcement learning to induce unsupervised tree structures on both the source and target. When trained on character-level datasets with no explicit segmentation or parse annotation, the model learns a plausible segmentation and shallow parse, obtaining performance close to an attentional baseline.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
