One Size Does Not Fit All: The Case for Personalised Word Complexity Models
Sian Gooding, Manuel Tragut

TL;DR
This paper advocates for personalized word complexity models tailored to individual readers, demonstrating that such models outperform generic ones and providing a new dataset and active learning framework for future research.
Contribution
The paper introduces a novel active learning framework for tailoring word complexity models to individual readers, and releases a dataset of complexity annotations and models as a benchmark for the community.
Findings
Personalized models outperform generic models in predicting word difficulty.
Active learning effectively tailors models to individual readers.
A new dataset of complexity annotations is provided for further research.
Abstract
Complex Word Identification (CWI) aims to detect words within a text that a reader may find difficult to understand. It has been shown that CWI systems can improve text simplification, readability prediction and vocabulary acquisition modelling. However, the difficulty of a word is a highly idiosyncratic notion that depends on a reader's first language, proficiency and reading experience. In this paper, we show that personal models are best when predicting word complexity for individual readers. We use a novel active learning framework that allows models to be tailored to individuals and release a dataset of complexity annotations and models as a benchmark for further research.
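The abstract does not spell out how the active learning framework works, so the following is only a minimal sketch of one common form of active learning (pool-based uncertainty sampling) applied to a per-reader word complexity model. Everything in it is an assumption made for illustration: the two features (word length and log frequency), the small logistic regression, and the names `features`, `train`, `most_uncertain` and `personalise` are hypothetical and are not taken from the paper.

```python
import math


def features(word, freq):
    # Assumed toy features: a bias term, scaled word length, scaled log10 frequency.
    return [1.0, len(word) / 10.0, math.log10(freq) / 6.0]


def predict(weights, x):
    # Probability that this reader finds the word complex (logistic model).
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, lr=0.5):
    # Plain stochastic gradient descent on the logistic loss.
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, y in examples:
            err = predict(weights, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
    return weights


def most_uncertain(weights, pool):
    # Uncertainty sampling: pick the word whose predicted probability is closest to 0.5.
    return min(pool, key=lambda item: abs(predict(weights, features(*item)) - 0.5))


def personalise(pool, ask_reader, seed_labels, budget):
    # pool: list of (word, frequency) pairs the reader has not labelled yet.
    # ask_reader: callable returning 1 if the reader finds the word complex, else 0.
    # seed_labels: list of ((word, frequency), label) pairs used for the initial model.
    labelled = list(seed_labels)
    pool = list(pool)
    weights = train([(features(*item), y) for item, y in labelled])
    for _ in range(budget):
        if not pool:
            break
        item = most_uncertain(weights, pool)
        pool.remove(item)
        labelled.append((item, ask_reader(item[0])))
        # Retrain on everything this reader has labelled so far.
        weights = train([(features(*it), y) for it, y in labelled])
    return weights, labelled
```

In this sketch each reader gets their own `weights`, starting from a few seed labels and refined by asking only about the words the current model is least sure of. A real system would likely use richer features and a stronger model; the loop structure is the point of the example.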
Taxonomy
Topics: Text Readability and Simplification · Natural Language Processing Techniques · Topic Modeling
