Context- and Sequence-Aware Convolutional Recurrent Encoder for Neural Machine Translation

Ritam Mallick; Seba Susan; Vaibhaw Agrawal; Rizul Garg; Prateek Rawal

arXiv:2101.04030·cs.CL·May 4, 2021

TL;DR

This paper introduces a novel neural machine translation encoder that combines convolutional and recurrent layers to better capture context and sequential information, resulting in improved translation quality.

Contribution

The paper proposes a convolutional-recurrent encoder that integrates phrase-level context with sequential information, enhancing translation performance over existing models.

Findings

01

Achieved higher BLEU scores compared to state-of-the-art methods.

02

Demonstrated effective capture of phrase-level and sequential information.

03

Validated on a German-English translation dataset.

Abstract

A Neural Machine Translation model is a sequence-to-sequence converter based on neural networks. Existing models use recurrent neural networks to construct both the encoder and decoder modules. In alternative research, the recurrent networks were replaced by convolutional neural networks to capture the syntactic structure of the input sentence and reduce processing time. We combine the strengths of both approaches by proposing a convolutional-recurrent encoder that captures both the context information and the sequential information in the source sentence. Word embedding and position embedding of the source sentence are performed prior to the convolutional encoding layer, which is essentially an n-gram feature extractor capturing phrase-level context information. The rectified output of the convolutional encoding layer is added to the original embedding vector, and the sum…
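The encoder steps that the abstract states before it is cut off (word plus position embedding, an n-gram convolution, rectification, and addition back to the original embedding) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names, the random initialisation, the zero padding, and the choice to make the convolution output the same width as the embedding are all assumptions made here for the example. What happens to the sum afterwards is elided in the abstract above, so the sketch stops at the sum.

```python
import random


def make_matrix(rows, cols, rng):
    """Hypothetical helper: small random matrix as a list of lists."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def embed(token_ids, word_table, pos_table):
    """Add each token's word embedding to the embedding of its position."""
    return [
        [w + p for w, p in zip(word_table[tok], pos_table[i])]
        for i, tok in enumerate(token_ids)
    ]


def conv_ngram_relu(x, kernel, bias):
    """1-D convolution over windows of n tokens, followed by ReLU.

    kernel[j][d_in][d_out] holds the weights for window offset j, so
    n = len(kernel) is the n-gram width. The input is zero-padded so the
    output has one vector per source token (an assumption of this sketch).
    """
    n, dim = len(kernel), len(x[0])
    pad_left = (n - 1) // 2
    pad_right = n - 1 - pad_left
    zeros = [0.0] * dim
    padded = [zeros] * pad_left + x + [zeros] * pad_right
    out = []
    for i in range(len(x)):
        row = []
        for o in range(dim):
            s = bias[o]
            for j in range(n):
                for d in range(dim):
                    s += padded[i + j][d] * kernel[j][d][o]
            row.append(max(0.0, s))  # rectification
        out.append(row)
    return out


def conv_residual(token_ids, word_table, pos_table, kernel, bias):
    """Embedding plus the rectified convolution output (the 'sum')."""
    e = embed(token_ids, word_table, pos_table)
    c = conv_ngram_relu(e, kernel, bias)
    return [[a + b for a, b in zip(er, cr)] for er, cr in zip(e, c)]


# Toy usage: vocabulary of 10 ids, 4-dimensional embeddings, trigram window.
rng = random.Random(0)
vocab_size, max_len, dim, n = 10, 8, 4, 3
word_table = make_matrix(vocab_size, dim, rng)
pos_table = make_matrix(max_len, dim, rng)
kernel = [make_matrix(dim, dim, rng) for _ in range(n)]
bias = [0.0] * dim
summed = conv_residual([3, 1, 4, 1, 5], word_table, pos_table, kernel, bias)
```

In this sketch `summed` holds one vector per source token, each combining the token's own embedding with features drawn from its surrounding window of `n` tokens.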
