Paraphrasing via Ranking Many Candidates

Joosung Lee

arXiv:2107.09274·cs.CL·May 10, 2022

TL;DR

This paper introduces a ranking-based method to select high-quality paraphrases from multiple candidates, improving paraphrase generation and data augmentation across different languages and domains.

Contribution

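The paper proposes a simple approach that selects the best paraphrase from many candidates, produced by different combinations of generative models and decoding options. The approach is easy to apply across domains and performs sufficiently well compared to previous methods.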

Findings

01 Effective in generating diverse paraphrases

02 Improves downstream task performance with data augmentation

03 Applicable to multiple languages including English and Korean

Abstract

We present a simple and effective way to generate a variety of paraphrases and to find a good-quality paraphrase among them. As previous studies have shown, it is difficult to ensure that a single generation method always produces the best paraphrase across domains. We therefore focus on finding the best candidate among many, rather than assuming that there is only one combination of generative model and decoding options. Our approach is easy to apply in various domains and performs sufficiently well compared to previous methods. In addition, it can be used for data augmentation that extends a downstream corpus, and we show that it can help improve performance on English and Korean datasets.
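The generate-then-rank idea in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the candidate list stands in for the outputs of several generative models and decoding options, and `toy_score` is a hypothetical word-overlap heuristic, not the ranking criterion used in the paper.

```python
from typing import Callable, List, Tuple


def tokens(text: str) -> set:
    """Lowercased whitespace tokens, as a set."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the token sets of two sentences."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def toy_score(source: str, candidate: str) -> float:
    """Hypothetical score, NOT the paper's metric.

    It is zero for exact copies (overlap 1) and for unrelated
    sentences (overlap 0), and peaks at overlap 0.5, so it favors
    candidates that keep some of the wording but not all of it.
    """
    overlap = jaccard(source, candidate)
    return overlap * (1.0 - overlap)


def rank_candidates(
    source: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float] = toy_score,
) -> List[Tuple[str, float]]:
    """Score every unique candidate and sort from best to worst."""
    unique = list(dict.fromkeys(candidates))  # drop duplicates, keep order
    scored = [(c, score_fn(source, c)) for c in unique]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    source = "the cat sat on the mat"
    # Stand-ins for outputs of different models and decoding options.
    candidates = [
        "the cat sat on the mat",
        "a cat was sitting on the mat",
        "dogs bark loudly",
    ]
    for text, score in rank_candidates(source, candidates):
        print(f"{score:.3f}  {text}")
```

Because the scorer is passed in as `score_fn`, a learned or embedding-based metric could replace the toy heuristic without changing the ranking loop. The same top-ranked outputs could also be appended to a training set, which is the data augmentation use the abstract mentions.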

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications