Generative Pretraining for Paraphrase Evaluation

Jack Weston; Raphael Lenain; Udeepa Meepegama; Emil Fristed

arXiv:2107.08251 · cs.CL · July 27, 2021

Open Access

TL;DR

ParaBLEU is a paraphrase evaluation metric that uses generative pretraining to align more closely with human judgments, outperforming existing metrics and remaining robust when labelled data is scarce.

Contribution

The paper introduces ParaBLEU, a model that learns paraphrase representations through generative pretraining, achieves state-of-the-art results on the 2017 WMT Metrics Shared Task, and enables conditional paraphrase generation.

Findings

01

ParaBLEU correlates more strongly with human judgments than existing metrics.

02

ParaBLEU exceeds previous state-of-the-art performance using only 50% of the available training data, and surpasses BLEU, ROUGE and METEOR with only 40 labelled examples.

03

ParaBLEU can conditionally generate novel paraphrases from a single demonstration.

Abstract

We introduce ParaBLEU, a paraphrase representation learning model and evaluation metric for text generation. Unlike previous approaches, ParaBLEU learns to understand paraphrasis using generative conditioning as a pretraining objective. ParaBLEU correlates more strongly with human judgements than existing metrics, obtaining new state-of-the-art results on the 2017 WMT Metrics Shared Task. We show that our model is robust to data scarcity, exceeding previous state-of-the-art performance using only $50\%$ of the available training data and surpassing BLEU, ROUGE and METEOR with only $40$ labelled examples. Finally, we demonstrate that ParaBLEU can be used to conditionally generate novel paraphrases from a single demonstration, which we use to confirm our hypothesis that it learns abstract, generalized paraphrase representations.
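
The claim that one metric "correlates more strongly with human judgements" than another can be made concrete with a small sketch. The example below computes Pearson's r between human ratings and the scores of two metrics. Everything in it is an assumption for illustration: the abstract does not say which correlation statistic the paper reports, the helper name `pearson_r` is hypothetical, and the scores are invented toy values, not results from the paper or from ParaBLEU.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy data (invented): one human rating per candidate/reference pair,
# plus the scores that two hypothetical metrics assign to the same pairs.
human = [0.9, 0.2, 0.6, 0.4, 0.8]
metric_a = [0.85, 0.30, 0.55, 0.45, 0.70]
metric_b = [0.40, 0.50, 0.45, 0.60, 0.35]

r_a = pearson_r(human, metric_a)
r_b = pearson_r(human, metric_b)
print(f"metric_a r = {r_a:.3f}, metric_b r = {r_b:.3f}")
```

On this toy data `metric_a` tracks the human ratings and `metric_b` does not, so `metric_a` has the higher coefficient. Comparing metrics on a benchmark follows the same pattern at larger scale, typically per language pair.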

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques