Unsupervised Neural Text Simplification
Sai Surya, Abhijit Mishra, Anirban Laha, Parag Jain, Karthik Sankaranarayanan

TL;DR
This paper introduces an unsupervised neural model for text simplification that learns from unlabeled text alone. It is competitive with existing supervised methods, and adding a few labeled pairs improves its performance further.
Contribution
It presents a first attempt at unsupervised neural text simplification using only unlabeled corpora, combining a shared encoder and a pair of attentional decoders with discrimination-based losses and denoising.
Findings
The model simplifies text at both the lexical and syntactic levels
It is competitive with existing supervised methods on public test data
Adding a few labeled pairs improves performance further
Abstract
The paper presents a first attempt towards unsupervised neural text simplification that relies only on unlabeled text corpora. The core framework is composed of a shared encoder and a pair of attentional decoders, and it gains knowledge of simplification through discrimination-based losses and denoising. The framework is trained using unlabeled text collected from an English Wikipedia dump. Our analysis (both quantitative and qualitative, involving human evaluators) on public test data shows that the proposed model can perform text simplification at both lexical and syntactic levels, and is competitive with existing supervised methods. Adding a few labeled pairs improves performance further.
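The abstract names denoising as one of the training signals but does not say how sentences are corrupted. As a minimal sketch, assuming the word dropout and bounded local shuffle that are common in unsupervised sequence-to-sequence work, a noise function could look like the following. The name `add_noise` and the parameters `drop_prob` and `max_shift` are hypothetical and are not taken from the paper.

```python
import random


def add_noise(tokens, drop_prob=0.1, max_shift=3, rng=None):
    """Corrupt a token list for denoising training (illustrative only).

    Assumed noise scheme, not confirmed by the paper:
      1. drop each token independently with probability drop_prob;
      2. shuffle the survivors locally so that no token moves more
         than max_shift positions.
    """
    if not tokens:
        return []
    rng = rng or random.Random()

    # Step 1: word dropout.
    kept = [t for t in tokens if rng.random() >= drop_prob]
    if not kept:
        # Avoid returning an empty sentence: keep one random token.
        kept = [rng.choice(tokens)]

    # Step 2: local shuffle. Each token gets a sort key equal to its
    # index plus a random offset in [0, max_shift]; sorting by that key
    # bounds how far any token can travel.
    keys = [i + rng.uniform(0, max_shift) for i in range(len(kept))]
    order = sorted(range(len(kept)), key=lambda i: keys[i])
    return [kept[i] for i in order]


if __name__ == "__main__":
    sentence = "the framework is trained using unlabeled text".split()
    print(add_noise(sentence, rng=random.Random(0)))
```

In a denoising setup of this kind, the decoder would be asked to reconstruct the original sentence from the corrupted one, so the model cannot simply copy its input token by token.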
Taxonomy
Topics: Text Readability and Simplification · Natural Language Processing Techniques · Topic Modeling
