# Neural Machine Translation with Recurrent Highway Networks

**Authors:** Maulik Parmar, V.Susheela Devi

arXiv: 1905.01996 · 2019-05-07

## TL;DR

This paper explores the use of Recurrent Highway Networks in neural machine translation, demonstrating comparable or superior performance to LSTM models and highlighting their ease of training with increased depth.

## Contribution

It applies Recurrent Highway Networks to NMT in an attention-based encoder-decoder, showing that they are effective and easier to train than traditional LSTM-based models, especially as depth increases.

## Key findings

- RHN performs on par with LSTM-based models in NMT tasks, and better in some cases
- Deep RHN models are easier to train than deep LSTM models because of their highway connections
- The paper also investigates the effects of increasing recurrent depth within each time step

## Abstract

Recurrent Neural Networks have lately gained a lot of popularity in language modelling tasks, especially in neural machine translation (NMT). Very recent NMT models are based on the Encoder-Decoder architecture, where a deep LSTM-based encoder is used to project the source sentence to a fixed-dimensional vector and then another deep LSTM decodes the target sentence from that vector. However, there has been very little work on exploring architectures that have more than one layer in space (i.e. in each time step). This paper examines the effectiveness of simple Recurrent Highway Networks (RHN) in NMT tasks. The model uses a Recurrent Highway Network in the encoder and decoder, with attention. We also explore the reconstructor model to improve adequacy. We demonstrate the effectiveness of all three approaches on the IWSLT English-Vietnamese dataset. We see that RHN performs on par with LSTM-based models and even better in some cases. We see that deep RHN models are easy to train compared to deep LSTM-based models because of highway connections. The paper also investigates the effects of increasing recurrent depth in each time step.
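
To make "recurrent depth in each time step" concrete, the following is a minimal NumPy sketch of a single RHN time step. It assumes the commonly used RHN formulation with a coupled carry gate (carry = 1 - transform) and feeds the input only to the first micro-layer; the paper's exact configuration may differ. The names `rhn_step`, `init_params`, and the parameter dictionary layout are hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(input_dim, hidden_dim, depth, seed=0):
    """Hypothetical helper: small random weights for an RHN of a given recurrent depth."""
    rng = np.random.default_rng(seed)

    def mat(rows, cols):
        return rng.normal(scale=0.1, size=(rows, cols))

    return {
        "W_h": mat(input_dim, hidden_dim),  # input -> candidate (first micro-layer only)
        "W_t": mat(input_dim, hidden_dim),  # input -> transform gate (first micro-layer only)
        "R_h": [mat(hidden_dim, hidden_dim) for _ in range(depth)],
        "R_t": [mat(hidden_dim, hidden_dim) for _ in range(depth)],
        "b_h": [np.zeros(hidden_dim) for _ in range(depth)],
        "b_t": [np.zeros(hidden_dim) for _ in range(depth)],
    }


def rhn_step(x, s_prev, params):
    """One time step: the state passes through `depth` highway micro-layers."""
    s = s_prev
    depth = len(params["R_h"])
    for layer in range(depth):
        # Assumption: the external input x enters only at the first micro-layer.
        in_h = x @ params["W_h"] if layer == 0 else 0.0
        in_t = x @ params["W_t"] if layer == 0 else 0.0
        h = np.tanh(in_h + s @ params["R_h"][layer] + params["b_h"][layer])
        t = sigmoid(in_t + s @ params["R_t"][layer] + params["b_t"][layer])
        # Highway update with coupled gates: carry = 1 - t.
        s = h * t + s * (1.0 - t)
    return s


# Usage: recurrent depth 2, input size 4, state size 3.
params = init_params(input_dim=4, hidden_dim=3, depth=2)
x = np.ones(4)
s0 = np.zeros(3)
s1 = rhn_step(x, s0, params)
```

When the transform gate is near zero the previous state is carried through almost unchanged, which is the usual intuition for why highway connections can ease the training of deeper recurrent transitions.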

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.01996/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1905.01996/full.md

## References

11 references — full list in the complete paper: https://tomesphere.com/paper/1905.01996/full.md

---
Source: https://tomesphere.com/paper/1905.01996