Readability-guided Idiom-aware Sentence Simplification (RISS) for Chinese
Jingshen Zhang, Xinglu Chen, Xinying Qiu, Zhimin Wang, Wenhe Feng

TL;DR
RISS is a novel framework for Chinese sentence simplification that addresses both the prevalence of idioms and the lack of large-scale labeled data by combining data augmentation, paraphrase selection, and idiom-aware modeling, outperforming previous methods.
Contribution
The paper introduces RISS, a new framework that integrates data augmentation, paraphrase selection, and idiom-aware simplification with multi-stage learning for Chinese sentence simplification.
Findings
RISS outperforms previous state-of-the-art methods on two Chinese sentence simplification datasets.
Fine-tuning RISS on small labeled data yields further improvements.
The approach simplifies Chinese sentences, including their idiomatic expressions, while preserving meaning.
Abstract
Chinese sentence simplification faces challenges due to the lack of large-scale labeled parallel corpora and the prevalence of idioms. To address these challenges, we propose Readability-guided Idiom-aware Sentence Simplification (RISS), a novel framework that combines data augmentation techniques with lexical simplification. RISS introduces two key components: (1) Readability-guided Paraphrase Selection (RPS), a method for mining high-quality sentence pairs, and (2) Idiom-aware Simplification (IAS), a model that enhances the comprehension and simplification of idiomatic expressions. By integrating RPS and IAS using multi-stage and multi-task learning strategies, RISS outperforms previous state-of-the-art methods on two Chinese sentence simplification datasets. Furthermore, RISS achieves additional improvements when fine-tuned on a small labeled dataset. Our approach demonstrates the…
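Illustration
The abstract describes RPS only as a method for mining high-quality sentence pairs under readability guidance; it does not give the scoring function or thresholds. The Python sketch below is therefore a rough illustration of the general idea, not the authors' method. The names `toy_reading_cost`, `select_pairs`, and `min_gap`, the use of character count as a difficulty proxy, and the example sentences are all assumptions made for this illustration.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, candidate paraphrase)


def toy_reading_cost(sentence: str) -> float:
    """Hypothetical difficulty score: higher means harder to read.

    Uses the character count as a crude stand-in. This is NOT the
    readability measure used in the paper, which the abstract does not specify.
    """
    return float(len(sentence))


def select_pairs(
    candidates: List[Pair],
    cost: Callable[[str], float] = toy_reading_cost,
    min_gap: float = 1.0,
) -> List[Pair]:
    """Keep pairs whose paraphrase is easier than the source by at least min_gap."""
    selected = []
    for source, paraphrase in candidates:
        # A pair is useful as simplification training data only if the
        # paraphrase side scores as easier to read than the source side.
        if cost(source) - cost(paraphrase) >= min_gap:
            selected.append((source, paraphrase))
    return selected


if __name__ == "__main__":
    # Made-up example pairs; the first source contains idioms.
    pairs = [
        ("他对这件事情的来龙去脉了如指掌。", "他很清楚这件事。"),
        ("今天天气好。", "今天的天气非常非常好。"),
    ]
    for source, paraphrase in select_pairs(pairs):
        print(source, "->", paraphrase)
```

In this sketch only the first pair survives, because its paraphrase is shorter than its source under the toy score. A realistic scorer would replace `toy_reading_cost` with a proper readability estimate and would also check that the paraphrase preserves the source meaning.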
Taxonomy
Topics: Text Readability and Simplification · Natural Language Processing Techniques
