Paraphrasing via Ranking Many Candidates
Joosung Lee

TL;DR
This paper introduces a ranking-based method to select high-quality paraphrases from multiple candidates, improving paraphrase generation and data augmentation across different languages and domains.
Contribution
It proposes a simple, domain-agnostic approach that selects the best paraphrase from many candidates, making it easy to apply across domains while performing sufficiently well compared to previous methods.
Findings
Effective in generating diverse paraphrases
Improves downstream task performance with data augmentation
Applicable to multiple languages including English and Korean
Abstract
We present a simple and effective way to generate a variety of paraphrases and find a good-quality paraphrase among them. As previous studies have shown, it is difficult to ensure that a single generation method always produces the best paraphrase across domains. Therefore, we focus on finding the best candidate among multiple candidates, rather than assuming that there is only one combination of generative model and decoding options. Our approach is easy to apply in various domains and performs sufficiently well compared to previous methods. In addition, it can be used for data augmentation that extends a downstream corpus, and it helps improve performance on English and Korean datasets.
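The candidate-selection idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify how candidates are generated or scored, so the function names (`rank_candidates`, `jaccard`) and the word-overlap scorer below are hypothetical stand-ins. In practice the candidates would come from several generative models and decoding options, and the scorer would likely be a learned semantic-similarity measure.

```python
from typing import Callable, List, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial copies compare equal."""
    return " ".join(text.lower().split())


def jaccard(source: str, candidate: str) -> float:
    """Toy scorer (an assumption, not the paper's metric): word-set overlap."""
    a, b = set(normalize(source).split()), set(normalize(candidate).split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def rank_candidates(
    source: str,
    candidates: List[str],
    score: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Drop copies of the source and duplicates, then sort by score (best first)."""
    source_key = normalize(source)
    seen = set()
    unique = []
    for cand in candidates:
        key = normalize(cand)
        if key == source_key or key in seen:
            continue  # a paraphrase identical to the input adds nothing
        seen.add(key)
        unique.append(cand)
    scored = [(cand, score(source, cand)) for cand in unique]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    source = "the cat sat on the mat"
    # Hypothetical candidates, as if pooled from several generators.
    candidates = [
        "The cat sat on the mat",
        "a cat sat on the mat",
        "the feline rested on a rug",
        "a cat sat on the mat",
    ]
    for cand, value in rank_candidates(source, candidates, jaccard):
        print(f"{value:.3f}  {cand}")
```

Keeping the scorer as a parameter reflects the point of the abstract: the generation side can vary freely, and only the ranking step decides which candidate is kept. For data augmentation, the top-ranked candidate for each training sentence could be appended to the corpus with the original label.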
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
