Encoding Source Language with Convolutional Neural Network for Machine Translation
Fandong Meng, Zhengdong Lu, Mingxuan Wang, Hang Li, Wenbin Jiang, Qun Liu

TL;DR
This paper introduces a convolutional neural network architecture that systematically encodes source language information for machine translation, improving upon previous models by better identifying relevant source parts.
Contribution
It presents a novel convolutional and gating architecture guided by target information to enhance source encoding in neural machine translation.
Findings
Achieves an improvement of up to +1.08 BLEU points
Effectively identifies relevant source sentence parts
Enhances neural network joint model performance
Abstract
The recently proposed neural network joint model (NNJM) (Devlin et al., 2014) augments the n-gram target language model with a heuristically chosen source context window, achieving state-of-the-art performance in SMT. In this paper, we give a more systematic treatment by summarizing the relevant source information through a convolutional architecture guided by the target information. With different guiding signals during decoding, our specifically designed convolution+gating architectures can pinpoint the parts of a source sentence that are relevant to predicting a target word, and fuse them with the context of the entire source sentence to form a unified representation. This representation, together with target language words, is fed to a deep neural network (DNN) to form a stronger NNJM. Experiments on two NIST Chinese-English translation tasks show that the proposed model can achieve…
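The convolution+gating idea in the abstract can be illustrated with a small sketch. This is not the paper's architecture: it assumes a single convolution layer, a sigmoid gate computed from a dot product with a target-side guiding vector, and max pooling over positions. All function names (`conv_features`, `gate`, `max_pool`, `encode_source`), the toy weights, and the vector sizes are hypothetical and chosen only for illustration; in a real model the weights would be learned.

```python
import math

def conv_features(source, weights, width=3):
    """Slide a window of `width` words (odd, zero-padded) over the source vectors.

    Each output feature is tanh(row . window) for every row in `weights`,
    so each row must have length width * len(source[0]).
    """
    dim = len(source[0])
    pad = [[0.0] * dim] * (width // 2)
    padded = pad + source + pad
    features = []
    for i in range(len(source)):
        # Concatenate the vectors in the current window into one flat list.
        window = [x for vec in padded[i:i + width] for x in vec]
        features.append([math.tanh(sum(w * x for w, x in zip(row, window)))
                         for row in weights])
    return features

def gate(features, guide):
    """Scale each position's features by a sigmoid gate driven by `guide`.

    `guide` stands in for the target-side guiding signal (an assumption here).
    """
    gated = []
    for feat in features:
        score = sum(f * g for f, g in zip(feat, guide))
        g = 1.0 / (1.0 + math.exp(-score))
        gated.append([g * f for f in feat])
    return gated

def max_pool(features):
    """Element-wise max over positions, giving one fixed-size vector."""
    return [max(column) for column in zip(*features)]

def encode_source(source, weights, guide, width=3):
    """Convolve, gate with the guiding signal, then pool into one vector."""
    return max_pool(gate(conv_features(source, weights, width), guide))

# Toy usage: three source words with 2-dimensional embeddings.
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[0.1] * 6, [-0.1] * 6]   # two feature maps over a 3-word window
vec_a = encode_source(source, weights, guide=[1.0, 0.0])
vec_b = encode_source(source, weights, guide=[0.0, 1.0])
```

The same source sentence yields different summaries under different guiding vectors, which is the sense in which the target side steers the source encoding in this sketch. The resulting vector would then be combined with the target history words as input to the joint model's feed-forward network.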
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
