Tracing How Annotators Think: Augmenting Preference Judgments with Reading Processes
Karin de Langis, William Walker, Khanh Chi Le, Dongyeop Kang

TL;DR
This paper introduces an annotation framework that records annotators' reading behaviors alongside their labels, to better understand decision-making and reliability in subjective NLP tasks. It is demonstrated through a case study on preference annotation using mouse-tracking data.
Contribution
It presents a method for annotating reading processes in addition to labels, and provides a dataset, PreferRead, with an analysis linking reading behaviors to annotation outcomes.
Findings
Annotators re-read a response in roughly half of all trials, most often the option they ultimately choose; they rarely revisit the prompt.
Re-reading correlates with higher inter-annotator agreement.
Longer reading paths are associated with lower agreement.
Abstract
We propose an annotation approach that captures not only labels but also the reading process underlying annotators' decisions, e.g., what parts of the text they focus on, re-read or skim. Using this framework, we conduct a case study on the preference annotation task, creating a dataset PreferRead that contains fine-grained annotator reading behaviors obtained from mouse tracking. PreferRead enables detailed analysis of how annotators navigate between a prompt and two candidate responses before selecting their preference. We find that annotators re-read a response in roughly half of all trials, most often revisiting the option they ultimately choose, and rarely revisit the prompt. Reading behaviors are also significantly related to annotation outcomes: re-reading is associated with higher inter-annotator agreement, whereas long reading paths and times are associated with lower…
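To make the reading-behavior measures concrete, here is a minimal sketch of how a single trial could be summarized from an ordered log of the regions the cursor dwelt on. The region labels ("prompt", "A", "B"), the function name summarize_trial, and the definition of re-reading as returning to a region after having left it are all assumptions made for illustration; they are not taken from the paper or from the PreferRead data format.

```python
from itertools import groupby


def summarize_trial(visits):
    """Summarize one annotation trial.

    visits: ordered list of region labels the cursor dwelt on,
    e.g. "prompt", "A", "B" (hypothetical labels, not the paper's schema).
    """
    # Collapse consecutive samples from the same region into one visit.
    path = [region for region, _ in groupby(visits)]

    # Count how many separate visits each region received.
    visit_counts = {}
    for region in path:
        visit_counts[region] = visit_counts.get(region, 0) + 1

    # Assumed definition: a region is re-read if it is visited more than once.
    reread = sorted(r for r, n in visit_counts.items() if n > 1)

    return {"path": path, "path_length": len(path), "reread": reread}


if __name__ == "__main__":
    trial = ["prompt", "prompt", "A", "B", "B", "A"]
    print(summarize_trial(trial))
```

Under these assumptions, the path length is the number of region-to-region visits in a trial, and per-trial summaries like this could then be compared against the chosen option or against agreement between annotators.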
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Topic Modeling · Expert finding and Q&A systems
