Learning to Paraphrase Sentences to Different Complexity Levels

Alison Chi; Li-Kuang Chen; Yi-Chen Chang; Shu-Hui Lee; Jason S. Chang

arXiv:2308.02226·cs.CL·November 22, 2023

TL;DR

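This paper introduces two unsupervised datasets for sentence simplification, complexification, and same-level paraphrasing. Among systems trained on unsupervised parallel data, models trained on the weak-classifier-labeled dataset achieve state-of-the-art results on the ASSET simplification benchmark. The paper also reports how a handful of large language models perform on these tasks in a zero-shot setting.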

Contribution

It presents two new unsupervised datasets, one labeled by a weak classifier and the other by a rule-based approach, and compares them with a single supervised dataset across multitasking and prompting strategies.

Findings

01

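Among systems trained on unsupervised parallel data, models trained on the weak-classifier-labeled dataset achieve state-of-the-art performance on the ASSET simplification benchmark.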

02

The models also outperform previous work on sentence level targeting.

03

The paper establishes how a handful of Large Language Models perform on these tasks in a zero-shot setting.

Abstract

While sentence simplification is an active research topic in NLP, its adjacent tasks of sentence complexification and same-level paraphrasing are not. To train models on all three tasks, we present two new unsupervised datasets. We compare these datasets, one labeled by a weak classifier and the other by a rule-based approach, with a single supervised dataset. Using these three datasets for training, we perform extensive experiments on both multitasking and prompting strategies. Compared to other systems trained on unsupervised parallel data, models trained on our weak classifier labeled dataset achieve state-of-the-art performance on the ASSET simplification benchmark. Our models also outperform previous work on sentence level targeting. Finally, we establish how a handful of Large Language Models perform on these tasks under a zero-shot setting.
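The three tasks named in the abstract differ only in how the target sentence's complexity level relates to the source's. The following is a minimal sketch of that relationship, not the authors' code: the `LabeledPair` type, the integer level scale, and the example sentences are all hypothetical, and it assumes each sentence in a pair already carries a level label (for instance from a weak classifier or a rule-based labeler).

```python
from typing import Dict, List, NamedTuple


class LabeledPair(NamedTuple):
    """A paraphrase pair with a complexity level on each side.

    The integer scale is an assumption for illustration; higher means
    more complex.
    """
    source: str
    target: str
    source_level: int
    target_level: int


def task_for(pair: LabeledPair) -> str:
    """Name the task implied by the relative levels of source and target."""
    if pair.target_level < pair.source_level:
        return "simplification"
    if pair.target_level > pair.source_level:
        return "complexification"
    return "same-level paraphrasing"


def group_by_task(pairs: List[LabeledPair]) -> Dict[str, List[LabeledPair]]:
    """Split labeled pairs into one training subset per task."""
    groups: Dict[str, List[LabeledPair]] = {
        "simplification": [],
        "complexification": [],
        "same-level paraphrasing": [],
    }
    for pair in pairs:
        groups[task_for(pair)].append(pair)
    return groups


# Illustrative pairs only; these sentences are not drawn from the paper's data.
example_pairs = [
    LabeledPair("The committee deliberated at length.", "The group talked for a long time.", 4, 2),
    LabeledPair("He was very tired.", "He was utterly exhausted.", 1, 3),
    LabeledPair("She bought a car.", "She purchased a car.", 2, 2),
]
grouped = group_by_task(example_pairs)
```

Under these assumptions, a single labeled corpus can serve all three tasks, since reversing a simplification pair yields a complexification pair.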

Code & Models

Repositories

alisonhc/change-complexity (PyTorch, official)

Taxonomy

Topics: Text Readability and Simplification · Natural Language Processing Techniques · Topic Modeling