Unsupervised Paraphrasing with Pretrained Language Models
Tong Niu, Semih Yavuz, Yingbo Zhou, Nitish Shirish Keskar, Huan Wang, Caiming Xiong

TL;DR
This paper introduces an unsupervised method for paraphrase generation using pre-trained language models, employing task-adaptation, self-supervision, and a novel decoding algorithm called Dynamic Blocking to achieve state-of-the-art results.
Contribution
The authors propose a new unsupervised training pipeline with Dynamic Blocking, enabling high-quality paraphrasing without labeled data, and demonstrate its effectiveness across datasets and languages.
Findings
Achieves state-of-the-art performance on QQP and ParaNMT datasets.
Robust to domain shifts between datasets.
Transfers effectively to other languages without fine-tuning.
Abstract
Paraphrase generation has benefited extensively from recent progress in the designing of training objectives and model architectures. However, previous explorations have largely focused on supervised methods, which require a large amount of labeled data that is costly to collect. To address this drawback, we adopt a transfer learning approach and propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting. Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking (DB). To enforce a surface form dissimilar from the input, whenever the language model emits a token contained in the source sequence, DB prevents the model from outputting the subsequent source token for the next generation step. We show with automatic and human evaluations that our approach achieves…
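The blocking rule described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: tokens are plain strings, decoding is greedy, and the model is stood in for by a hypothetical `score_fn` that returns a score per candidate token. The helper names (`build_block_map`, `blocked_next_tokens`, `greedy_decode_with_blocking`) are invented for illustration. The sketch covers only the rule as stated in the abstract and is not the authors' implementation.

```python
from collections import defaultdict


def build_block_map(source_tokens):
    """Map each source token to the set of tokens that directly follow it in the source."""
    block_map = defaultdict(set)
    for current, following in zip(source_tokens, source_tokens[1:]):
        block_map[current].add(following)
    return dict(block_map)


def blocked_next_tokens(block_map, last_emitted):
    """Tokens forbidden at the next step, given the token that was just emitted."""
    return block_map.get(last_emitted, set())


def greedy_decode_with_blocking(source_tokens, score_fn, max_len, eos="</s>"):
    """Greedy decoding that never emits a source token right after its source predecessor.

    score_fn is a hypothetical stand-in for a language model: it takes the
    tokens generated so far and returns a dict of candidate token -> score.
    """
    block_map = build_block_map(source_tokens)
    output = []
    for _ in range(max_len):
        last = output[-1] if output else None
        blocked = blocked_next_tokens(block_map, last)
        scores = score_fn(output)
        # Drop blocked candidates before picking the best remaining one.
        allowed = {tok: s for tok, s in scores.items() if tok not in blocked}
        if not allowed:
            break
        best = max(allowed, key=allowed.get)
        if best == eos:
            break
        output.append(best)
    return output
```

With the source `["the", "cat", "sat"]`, emitting "the" blocks "cat" for exactly one step, so a model that would otherwise copy the source is pushed toward a different continuation. The block lasts only for the next generation step, so "cat" becomes available again afterwards.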
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Linear Layer · Cosine Annealing · Gated Linear Unit · Residual Connection · Inverse Square Root Schedule · Layer Normalization · Byte Pair Encoding · Softmax · Adam · Dense Connections
