One Size Does Not Fit All: The Case for Personalised Word Complexity Models

Sian Gooding; Manuel Tragut

arXiv:2205.02564 · cs.CL · May 6, 2022 · 1 citation

Open Access

TL;DR

This paper advocates for personalized word complexity models tailored to individual readers, demonstrating that such models outperform generic ones and providing a new dataset and active learning framework for future research.

Contribution

The paper introduces a novel active learning framework for building personalized word complexity models, and releases a dataset of complexity annotations together with models as a benchmark for further research.

Findings

01. Personalized models outperform generic models in predicting word difficulty.
02. Active learning effectively tailors models to individual readers.
03. A new dataset of complexity annotations is provided for further research.

Abstract

Complex Word Identification (CWI) aims to detect words within a text that a reader may find difficult to understand. It has been shown that CWI systems can improve text simplification, readability prediction and vocabulary acquisition modelling. However, the difficulty of a word is a highly idiosyncratic notion that depends on a reader's first language, proficiency and reading experience. In this paper, we show that personal models are best when predicting word complexity for individual readers. We use a novel active learning framework that allows models to be tailored to individuals and release a dataset of complexity annotations and models as a benchmark for further research.
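
The abstract does not describe how the active learning framework operates, so the following is only a minimal sketch of one common approach, pool-based uncertainty sampling, and should not be read as the authors' method. It assumes a logistic model over two hypothetical features (normalised word length and a rarity score), a toy word pool, and a simulated reader; all names and values are illustrative assumptions.

```python
import math

def predict(weights, features):
    """Probability that this reader finds a word complex (logistic model)."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def train(labelled, epochs=500, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * (len(labelled[0][0]) + 1)  # bias + one weight per feature
    for _ in range(epochs):
        for features, label in labelled:
            error = predict(weights, features) - label
            weights[0] -= lr * error
            for i, x in enumerate(features):
                weights[i + 1] -= lr * error * x
    return weights

def most_uncertain(weights, pool):
    """Pick the unlabelled word whose predicted probability is closest to 0.5."""
    return min(pool, key=lambda word: abs(predict(weights, pool[word]) - 0.5))

def personalise(seed, pool, ask_reader, budget):
    """Query one reader `budget` times, retraining after every answer."""
    labelled, pool = list(seed), dict(pool)  # copies, so inputs are untouched
    weights = train(labelled)
    for _ in range(min(budget, len(pool))):
        word = most_uncertain(weights, pool)
        labelled.append((pool.pop(word), ask_reader(word)))
        weights = train(labelled)
    return weights

# Hypothetical toy data: (normalised word length, rarity score), both in [0, 1].
SEED = [((0.2, 0.1), 0), ((0.9, 0.9), 1)]
POOL = {
    "house": (0.3, 0.1),
    "ledger": (0.4, 0.6),
    "ubiquitous": (0.8, 0.7),
    "paradigm": (0.6, 0.8),
    "window": (0.4, 0.2),
}

if __name__ == "__main__":
    hard_for_reader = {"ubiquitous", "paradigm"}  # simulated reader answers
    weights = personalise(SEED, POOL, lambda w: int(w in hard_for_reader), budget=3)
    for word, features in POOL.items():
        print(f"{word}: {predict(weights, features):.2f}")
```

The point of the sketch is that each query goes to the word the current model is least sure about, so a small number of answers from one reader can shift a generic starting model toward that reader's own notion of difficulty.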

Taxonomy

Topics: Text Readability and Simplification · Natural Language Processing Techniques · Topic Modeling