Approximate Distribution Matching for Sequence-to-Sequence Learning

Wenhu Chen; Guanlin Li; Shujie Liu; Zhirui Zhang; Mu Li; Ming Zhou

arXiv:1808.08003 · cs.CL · September 5, 2018

TL;DR

This paper introduces a distribution matching framework for sequence-to-sequence learning, using local latent distribution approximation with neural networks to improve robustness and address data sparsity issues.

Contribution

It proposes a novel distribution matching approach with recurrent neural network augmenters for sequence-to-sequence tasks, enhancing data utilization and model robustness.

Findings

01 Outperforms existing algorithms in machine translation tasks
02 Improves robustness by locally augmenting data pairs
03 Reduces data sparsity issues in sequence learning

Abstract

Sequence-to-Sequence models were introduced to tackle many real-life problems such as machine translation, summarization, and image captioning. Standard optimization algorithms are mainly based on example-to-example matching, such as maximum likelihood estimation, which is known to suffer from the data sparsity problem. Here we present an alternative view that explains sequence-to-sequence learning as a distribution matching problem, where each source or target example is viewed as representing a local latent distribution in the source or target domain. We then interpret sequence-to-sequence learning as learning a transductive model that transforms the source local latent distributions to match their corresponding target distributions. In our framework, we approximate both the source and target latent distributions with recurrent neural networks (augmenters). During training, the parallel augmenters…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications