# Revisiting Recurrent Networks for Paraphrastic Sentence Embeddings

**Authors:** John Wieting, Kevin Gimpel

arXiv: 1705.00364 · 2017-05-02

## TL;DR

This paper revisits recurrent networks for paraphrastic sentence embeddings. With sentence-pair training, state averaging, and aggressive regularization, LSTMs outperform word averaging, and a new Gated Recurrent Averaging Network outperforms both.

## Contribution

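The paper shows that LSTMs can outperform simple word averaging when they are trained on sentence pairs rather than phrase pairs, represent a sequence by the average of their states, and are regularized aggressively. This reverses the earlier finding of Wieting et al. (2016b). The paper also introduces a new architecture, the Gated Recurrent Averaging Network.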

## Key findings

- With sentence-pair training, state averaging, and aggressive regularization, LSTMs outperform word averaging in both transfer learning and supervised settings.
- The Gated Recurrent Averaging Network surpasses both LSTMs and averaging.
- Analysis of the learned models finds evidence of preferences for particular parts of speech and dependency relations.
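
To make the "inspired by averaging and LSTMs" idea concrete, here is a minimal sketch of a gated averaging step. It assumes a gate of the form a_t = x_t ⊙ σ(W_x x_t + W_h h_t + b) followed by a mean over positions. This summary does not give the equations, so the gate form, the function name `gran_embedding`, and the parameter names are assumptions to check against the full paper. The hidden states `hidden_states` are taken as given (for example, produced by an LSTM).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gran_embedding(embeddings, hidden_states, W_x, W_h, b):
    """Average of gated word vectors (assumed form, see lead-in).

    embeddings:    list of word vectors x_t, each of length d
    hidden_states: list of recurrent states h_t, each of length d_h
    W_x: d x d matrix, W_h: d x d_h matrix, b: length-d bias
    """
    d = len(b)
    gated = []
    for x, h in zip(embeddings, hidden_states):
        # Gate depends on both the current word and the recurrent state.
        gate = [
            sigmoid(
                sum(W_x[i][j] * x[j] for j in range(len(x)))
                + sum(W_h[i][j] * h[j] for j in range(len(h)))
                + b[i]
            )
            for i in range(d)
        ]
        # Element-wise gating of the word vector.
        gated.append([x[i] * gate[i] for i in range(d)])
    n = len(gated)
    return [sum(a[i] for a in gated) / n for i in range(d)]


# Toy inputs: two 2-dimensional word vectors and zero hidden states.
words = [[2.0, 4.0], [6.0, 8.0]]
states = [[0.0, 0.0], [0.0, 0.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# With zero weights and bias every gate is 0.5, so the result is
# half of the plain word average.
half_avg = gran_embedding(words, states, zeros, zeros, [0.0, 0.0])

# With a large positive bias every gate is close to 1, so the result
# approaches plain word averaging.
near_avg = gran_embedding(words, states, zeros, zeros, [20.0, 20.0])
```

Under this assumed form, word averaging is the special case where every gate is saturated at 1, which is one way to read the claim that the architecture sits between averaging and a recurrent model.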

## Abstract

We consider the problem of learning general-purpose, paraphrastic sentence embeddings, revisiting the setting of Wieting et al. (2016b). While they found LSTM recurrent networks to underperform word averaging, we present several developments that together produce the opposite conclusion. These include training on sentence pairs rather than phrase pairs, averaging states to represent sequences, and regularizing aggressively. These improve LSTMs in both transfer learning and supervised settings. We also introduce a new recurrent architecture, the Gated Recurrent Averaging Network, that is inspired by averaging and LSTMs while outperforming them both. We analyze our learned models, finding evidence of preferences for particular parts of speech and dependency relations.
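
The abstract contrasts word averaging with recurrent encoders and lists "averaging states to represent sequences" as one of the changes. The sketch below illustrates the three sentence representations involved: the mean of the word vectors, the final recurrent state, and the mean of all recurrent states. It is illustrative only: a plain tanh RNN stands in for the LSTM, the weights are random rather than trained, and every name (`rnn_states`, `mean_vector`, `cosine`, `random_matrix`) is hypothetical.

```python
import math
import random


def rnn_states(embeddings, W_x, W_h, b):
    """Run a plain tanh RNN (a stand-in for an LSTM); return every hidden state."""
    dim = len(b)
    h = [0.0] * dim
    states = []
    for x in embeddings:
        h = [
            math.tanh(
                sum(W_x[i][j] * x[j] for j in range(len(x)))
                + sum(W_h[i][j] * h[j] for j in range(dim))
                + b[i]
            )
            for i in range(dim)
        ]
        states.append(h)
    return states


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def random_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
dim = 4
W_x, W_h, b = random_matrix(dim, dim), random_matrix(dim, dim), [0.0] * dim

# A toy "sentence" of five random word vectors.
sentence = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(5)]
states = rnn_states(sentence, W_x, W_h, b)

word_avg = mean_vector(sentence)   # word averaging baseline
last_state = states[-1]            # final recurrent state only
state_avg = mean_vector(states)    # average of all recurrent states
```

In a paraphrase setting, two sentences would each be encoded this way and compared with `cosine`. Averaging the states gives every position a direct contribution to the sentence vector, rather than relying on the final state to carry the whole sequence.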

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.00364/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/1705.00364/full.md

---
Source: https://tomesphere.com/paper/1705.00364