Neural Paraphrase Generation with Stacked Residual LSTM Networks
Aaditya Prakash, Sadid A. Hasan, Kathy Lee, Vivek Datla, Ashequl Qadir, Joey Liu, Oladimeji Farri

TL;DR
This paper introduces a stacked residual LSTM network for paraphrase generation that outperforms sequence-to-sequence, attention-based, and bidirectional LSTM baselines on three datasets (PPDB, WikiAnswers, and MSCOCO).
Contribution
To the authors' knowledge, this is the first work to explore deep learning models for paraphrase generation. Its primary contribution is a stacked LSTM network with residual connections between layers, which allows deep LSTMs to be trained efficiently.
Findings
The proposed model outperforms sequence-to-sequence, attention-based, and bidirectional LSTM models on BLEU, METEOR, TER, and an embedding-based sentence similarity metric.
Residual LSTM connections facilitate training of deeper networks for better paraphrase quality.
Experimental results on PPDB, WikiAnswers, and MSCOCO datasets confirm the effectiveness of the proposed approach.
Abstract
In this paper, we propose a novel neural approach for paraphrase generation. Conventional paraphrase generation methods either leverage hand-written rules and thesauri-based alignments, or use statistical machine learning principles. To the best of our knowledge, this work is the first to explore deep learning models for paraphrase generation. Our primary contribution is a stacked residual LSTM network, where we add residual connections between LSTM layers. This allows for efficient training of deep LSTMs. We evaluate our model and other state-of-the-art deep learning models on three different datasets: PPDB, WikiAnswers and MSCOCO. Evaluation results demonstrate that our model outperforms sequence-to-sequence, attention-based and bidirectional LSTM models on BLEU, METEOR, TER and an embedding-based sentence similarity metric.
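The abstract describes the core idea as adding residual connections between stacked LSTM layers. The following is a minimal NumPy sketch of that idea, not the authors' implementation. It covers only a forward pass over one sequence, with no encoder-decoder, attention, or training loop. The function names, the `residual_every` parameter, and the choice to add the shortcut after every two layers are illustrative assumptions, as is the requirement that the input size equal the hidden size so that the addition is well defined.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_lstm_params(input_size, hidden_size, rng):
    # One weight matrix covering the four gates (input, forget, output, candidate).
    scale = 1.0 / np.sqrt(hidden_size)
    W = rng.uniform(-scale, scale, (4 * hidden_size, input_size + hidden_size))
    b = np.zeros(4 * hidden_size)
    return W, b


def lstm_layer(xs, params):
    """Run one LSTM layer over a sequence xs of shape (T, input_size)."""
    W, b = params
    hidden_size = b.shape[0] // 4
    h = np.zeros(hidden_size)
    c = np.zeros(hidden_size)
    outputs = []
    for x in xs:
        z = W @ np.concatenate([x, h]) + b
        i, f, o, g = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # cell state update
        h = sigmoid(o) * np.tanh(c)                   # hidden state
        outputs.append(h)
    return np.stack(outputs)


def stacked_residual_lstm(xs, layer_params, residual_every=2):
    """Stack LSTM layers, adding the block input back after every
    `residual_every` layers (hypothetical parameter, chosen for illustration)."""
    block_input = xs
    out = xs
    for idx, params in enumerate(layer_params, start=1):
        out = lstm_layer(out, params)
        if idx % residual_every == 0:
            out = out + block_input  # identity shortcut across the block
            block_input = out
    return out


rng = np.random.default_rng(0)
size = 8  # input size == hidden size so the shortcut addition is valid
params = [init_lstm_params(size, size, rng) for _ in range(4)]
xs = rng.normal(size=(5, size))      # a toy sequence of 5 embedding vectors
ys = stacked_residual_lstm(xs, params)
```

Because the shortcut is an identity mapping, it introduces no extra parameters, and the gradient reaching lower layers includes a direct path that bypasses the intermediate LSTM layers. This is the usual intuition for why residual connections ease the training of deeper stacks.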
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Advanced Text Analysis Techniques
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
