Approximate Distribution Matching for Sequence-to-Sequence Learning
Wenhu Chen, Guanlin Li, Shujie Liu, Zhirui Zhang, Mu Li, Ming Zhou

TL;DR
This paper introduces a distribution matching framework for sequence-to-sequence learning, using local latent distribution approximation with neural networks to improve robustness and address data sparsity issues.
Contribution
It proposes a novel distribution matching approach with recurrent neural network augmenters for sequence-to-sequence tasks, enhancing data utilization and model robustness.
Findings
Outperforms existing algorithms in machine translation tasks
Improves robustness by locally augmenting data pairs
Reduces data sparsity issues in sequence learning
Abstract
Sequence-to-sequence models were introduced to tackle many real-life problems such as machine translation, summarization, and image captioning. Standard optimization algorithms are mainly based on example-to-example matching, such as maximum likelihood estimation, which is known to suffer from the data sparsity problem. Here we present an alternative view that explains sequence-to-sequence learning as a distribution matching problem, where each source or target example is viewed as representing a local latent distribution in the source or target domain. We then interpret sequence-to-sequence learning as learning a transductive model that transforms the source local latent distributions to match their corresponding target distributions. In our framework, we approximate both the source and target latent distributions with recurrent neural networks (augmenters). During training, the parallel augmenters…
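To make the contrast in the abstract concrete, the following toy sketch compares an example-to-example loss with a loss averaged over samples drawn from a small neighborhood of each pair. Everything here is an assumption for illustration: `toy_augmenter` is a hypothetical random token-substitution stand-in, not the recurrent augmenters the paper describes, the source and target are perturbed independently, and the paper's actual training objective is not given in this excerpt.

```python
import math
import random


def pair_nll(model, src, tgt):
    """Negative log-likelihood of one (source, target) pair.

    `model(src, tok)` is a hypothetical callable returning the probability
    of target token `tok` given the source sequence.
    """
    return -sum(math.log(model(src, tok)) for tok in tgt)


def toy_augmenter(tokens, neighbors, rng, rate=0.3):
    """Hypothetical stand-in for a learned augmenter.

    Each token that has an entry in `neighbors` is replaced by a randomly
    chosen neighbor with probability `rate`; other tokens are kept.
    """
    return [
        rng.choice(neighbors[t]) if t in neighbors and rng.random() < rate else t
        for t in tokens
    ]


def example_matching_loss(model, pairs):
    """Example-to-example matching: average NLL over the observed pairs only."""
    return sum(pair_nll(model, s, t) for s, t in pairs) / len(pairs)


def local_matching_loss(model, pairs, src_neighbors, tgt_neighbors,
                        n_samples=4, seed=0):
    """Average NLL over pairs sampled around each observed pair.

    This approximates an expectation over local neighborhoods of the source
    and target examples rather than over the observed points alone.
    """
    rng = random.Random(seed)
    total = 0.0
    for s, t in pairs:
        for _ in range(n_samples):
            s_aug = toy_augmenter(s, src_neighbors, rng)
            t_aug = toy_augmenter(t, tgt_neighbors, rng)
            total += pair_nll(model, s_aug, t_aug)
    return total / (len(pairs) * n_samples)
```

With a sparse training set, the second loss exposes the model to nearby pairs it would otherwise never see, which is the intuition behind treating each example as a representative of a local distribution.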
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
