A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification
Mounica Maddela, Wei Xu

TL;DR
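This paper introduces a human-rated word-complexity lexicon of 15,000 English words and a neural readability ranking model that outperforms state-of-the-art systems on lexical simplification tasks. Applying the model to the Paraphrase Database (PPDB) also yields SimplePPDB++, a resource of over 10 million simplifying paraphrase rules.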
Contribution
The paper contributes a human-rated word-complexity lexicon and a neural readability ranking model whose Gaussian-based feature vectorization layer uses those human ratings to measure the complexity of any given word or phrase.
Findings
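The model outperforms state-of-the-art systems across different lexical simplification tasks and evaluation datasets
The authors created a lexicon of 15,000 English words with human complexity ratings
Applying the model to the Paraphrase Database (PPDB) produced SimplePPDB++, a resource of over 10 million simplifying paraphrase rules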
Abstract
Current lexical simplification approaches rely heavily on heuristics and corpus-level features that do not always align with human judgment. We create a human-rated word-complexity lexicon of 15,000 English words and propose a novel neural readability ranking model with a Gaussian-based feature vectorization layer that utilizes these human ratings to measure the complexity of any given word or phrase. Our model performs better than the state-of-the-art systems for different lexical simplification tasks and evaluation datasets. We also produce SimplePPDB++, a lexical resource of over 10 million simplifying paraphrase rules, by applying our model to the Paraphrase Database (PPDB).
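The abstract does not spell out how the Gaussian-based feature vectorization layer works, so the following is only a minimal sketch of one common way to do it: a scalar feature (for example, a word's human complexity rating) is rescaled to [0, 1] and turned into a vector of Gaussian activations, one per evenly spaced bin center. The function name `gaussian_vectorize`, the parameters `num_bins`, `gamma`, `lo`, and `hi`, the default for `gamma`, and the 1-to-6 rating range in the demo are all assumptions made for illustration and are not taken from the paper.

```python
import math


def gaussian_vectorize(value, num_bins=10, gamma=None, lo=0.0, hi=1.0):
    """Map a scalar feature to a vector of Gaussian bin activations.

    Hypothetical sketch: bin layout and the gamma default are assumptions,
    not the paper's configuration.
    """
    if num_bins < 1:
        raise ValueError("num_bins must be at least 1")
    if hi <= lo:
        raise ValueError("hi must be greater than lo")

    # Clamp the feature to [lo, hi], then rescale it to [0, 1].
    x = (min(max(value, lo), hi) - lo) / (hi - lo)

    # Evenly spaced bin centers on [0, 1].
    width = 1.0 / num_bins
    centers = [(j + 0.5) * width for j in range(num_bins)]

    # Assumed default: a Gaussian whose standard deviation equals one bin width.
    if gamma is None:
        gamma = 1.0 / (2.0 * width ** 2)

    # Activation of bin j is exp(-gamma * (x - center_j)^2).
    return [math.exp(-gamma * (x - c) ** 2) for c in centers]


if __name__ == "__main__":
    # Hypothetical example: a rating of 3.5 on an assumed 1-to-6 scale.
    vec = gaussian_vectorize(3.5, num_bins=5, lo=1.0, hi=6.0)
    print([round(v, 3) for v in vec])
```

Compared with feeding the raw number to a network, a soft binning like this lets nearby feature values share similar vectors while still letting the model learn a different weight for each region of the feature's range.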
Taxonomy
Topics: Text Readability and Simplification · Natural Language Processing Techniques · Topic Modeling
