Neural Paraphrase Generation with Stacked Residual LSTM Networks

Aaditya Prakash; Sadid A. Hasan; Kathy Lee; Vivek Datla; Ashequl Qadir; Joey Liu; Oladimeji Farri

arXiv:1610.03098·cs.CL·October 14, 2016·221 cites

TL;DR

This paper introduces a novel deep learning model using stacked residual LSTM networks for paraphrase generation, outperforming existing models on multiple datasets and metrics.

Contribution

It is the first to apply residual connections in deep LSTM networks specifically for paraphrase generation, enabling more effective training and improved performance.

Findings

1. The proposed model outperforms sequence-to-sequence, attention-based, and bi-directional LSTM models on BLEU, METEOR, TER, and an embedding-based sentence similarity metric.

2. Residual connections between LSTM layers make deeper networks easier to train, which improves paraphrase quality.

3. Experiments on the PPDB, WikiAnswers, and MSCOCO datasets confirm the effectiveness of the proposed approach.

Abstract

In this paper, we propose a novel neural approach for paraphrase generation. Conventional paraphrase generation methods either leverage hand-written rules and thesauri-based alignments, or use statistical machine learning principles. To the best of our knowledge, this work is the first to explore deep learning models for paraphrase generation. Our primary contribution is a stacked residual LSTM network, where we add residual connections between LSTM layers. This allows for efficient training of deep LSTMs. We evaluate our model and other state-of-the-art deep learning models on three different datasets: PPDB, WikiAnswers and MSCOCO. Evaluation results demonstrate that our model outperforms sequence to sequence, attention-based and bi-directional LSTM models on BLEU, METEOR, TER and an embedding-based sentence similarity metric.
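The idea of adding residual connections between stacked LSTM layers can be illustrated with a small pure-Python sketch. This is not the authors' implementation; it only shows one plausible reading of the abstract. Everything named here is hypothetical: the function names, the `residual_every` parameter (how many layers sit between shortcuts), the choice of an identity shortcut (which requires the input and hidden sizes to match), and the choice to carry the pre-residual hidden state forward in time.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_lstm_params(size, rng):
    """Random weights for one LSTM layer whose input and hidden sizes are both `size`."""
    def matrix():
        return [[rng.uniform(-0.1, 0.1) for _ in range(2 * size)] for _ in range(size)]
    # One (weights, bias) pair per gate: input, forget, output, candidate.
    return {gate: (matrix(), [0.0] * size) for gate in ("i", "f", "o", "g")}


def lstm_step(params, x, h_prev, c_prev):
    """One standard LSTM time step; returns the new hidden and cell states."""
    z = x + h_prev  # list concatenation: [x ; h_prev]

    def affine(gate):
        weights, bias = params[gate]
        return [sum(w * v for w, v in zip(row, z)) + b for row, b in zip(weights, bias)]

    i = [sigmoid(a) for a in affine("i")]
    f = [sigmoid(a) for a in affine("f")]
    o = [sigmoid(a) for a in affine("o")]
    g = [math.tanh(a) for a in affine("g")]
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def stacked_residual_lstm(sequence, layers, residual_every=2):
    """Run a stack of LSTM layers over `sequence`, one vector per time step.

    Assumption: after every `residual_every` layers, the input that entered
    that group of layers is added to the group's output (identity shortcut).
    Passing residual_every=0 disables the shortcuts (plain stacked LSTM).
    """
    size = len(sequence[0])
    states = [([0.0] * size, [0.0] * size) for _ in layers]
    outputs = []
    for x in sequence:
        layer_input = x
        skip = x  # value the next shortcut will add back in
        for idx, params in enumerate(layers):
            h, c = lstm_step(params, layer_input, *states[idx])
            states[idx] = (h, c)  # recurrent state keeps the pre-residual h
            if residual_every and (idx + 1) % residual_every == 0:
                h = [hk + sk for hk, sk in zip(h, skip)]
                skip = h
            layer_input = h
        outputs.append(layer_input)
    return outputs


if __name__ == "__main__":
    rng = random.Random(0)
    layers = [make_lstm_params(3, rng) for _ in range(4)]
    sequence = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.5]]
    for vector in stacked_residual_lstm(sequence, layers):
        print([round(v, 4) for v in vector])
```

The shortcut adds no parameters, so the stack can be made deeper without extra weights on the skip path. A real model would use a framework's LSTM layers and learned embeddings rather than hand-rolled lists.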

Code & Models

Repositories

pushpendughosh/Stock-market-forecasting (TensorFlow)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Advanced Text Analysis Techniques

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory