Text Difficulty Study: Do machines behave the same as humans regarding text difficulty?

Bowen Chen; Xiao Ding; Li Du; Qin Bing; Ting Liu

arXiv:2208.14509·cs.CL·April 3, 2024

Open Access

TL;DR

This study investigates whether NLP models learn text difficulty similarly to humans, introducing the HLM Index and comparing learning behaviors of models like LSTM and BERT across various tasks.

Contribution

The paper proposes the Human Learning Matching Index (HLM Index) to investigate the effect of text difficulty, and compares how human-like the learning behavior of different NLP models is, revealing insights into training strategies and difficulty criteria.

Findings

01 LSTM exhibits more human-like learning behavior than BERT.

02 UID-SuperLinear gives the best evaluation of text difficulty among the four criteria compared.

03 Training from easy to hard data leads to fast convergence.

Abstract

Given a task, humans learn from easy to hard, whereas models learn in random order. Undeniably, difficulty-insensitive learning has led to great success in NLP, but little attention has been paid to the effect of text difficulty. In this research, we propose the Human Learning Matching Index (HLM Index) to investigate the effect of text difficulty. Experimental results show: (1) LSTM has more human-like learning behavior than BERT. (2) UID-SuperLinear gives the best evaluation of text difficulty among four text difficulty criteria. (3) Among nine tasks, performance is related to text difficulty for some tasks but not for others. (4) Models trained on easy data perform best on easy and medium data, whereas models trained on hard data perform well only on hard data. (5) Training the model from easy to hard leads to fast convergence.
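
As a rough illustration of findings (2) and (5), the sketch below scores texts with a super-linear surprisal cost and sorts them from easy to hard. It is not the authors' implementation: the abstract does not define UID-SuperLinear or the HLM Index, so the formula here (mean of per-token surprisal raised to a power k > 1, following the usual uniform-information-density formulation) is an assumption, as are the exponent `k=1.25`, the function names, and the toy surprisal values.

```python
from typing import Dict, List, Sequence


def uid_superlinear(surprisals: Sequence[float], k: float = 1.25) -> float:
    """Assumed difficulty score: mean of per-token surprisal raised to k (k > 1)."""
    if not surprisals:
        raise ValueError("need at least one token surprisal")
    if k <= 1:
        raise ValueError("a super-linear cost requires k > 1")
    return sum(s ** k for s in surprisals) / len(surprisals)


def order_easy_to_hard(corpus: Dict[str, Sequence[float]], k: float = 1.25) -> List[str]:
    """Return text ids sorted from lowest to highest assumed difficulty."""
    return sorted(corpus, key=lambda text_id: uid_superlinear(corpus[text_id], k))


# Hypothetical per-token surprisals (in bits); in practice these would
# come from a language model, which is outside the scope of this sketch.
toy_corpus = {
    "uniform_low": [2.0, 2.0, 2.0, 2.0],
    "spiky": [0.5, 0.5, 0.5, 6.5],
    "uniform_high": [5.0, 5.0, 5.0, 5.0],
}

curriculum = order_easy_to_hard(toy_corpus)
print(curriculum)
```

In this toy corpus, "uniform_low" and "spiky" have the same mean surprisal, yet the super-linear cost ranks "spiky" as harder because it penalizes uneven information density. An easy-to-hard training schedule would then feed examples to the model in the order given by `curriculum`.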

Taxonomy

Topics: Text Readability and Simplification · Intelligent Tutoring Systems and Adaptive Learning

Methods: Attention Is All You Need · Linear Layer · WordPiece · Layer Normalization · Softmax · Linear Warmup With Linear Decay · Adam · Tanh Activation · Multi-Head Attention · Weight Decay