TL;DR
This paper introduces the SuspectGuilt Corpus, a dataset of annotated crime stories that reveals how language influences readers' subjective guilt assessments, and develops models to predict these judgments.
Contribution
It provides a novel annotated corpus and predictive models that analyze the impact of linguistic choices on guilt perception in crime narratives.
Findings
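Predictive models benefit from genre pretraining and from joint supervision on text-level ratings and span-level annotations.
Span-level annotations show which parts of a story most influenced readers' guilt judgments.
The corpus supports study of how crime reporting shapes public perceptions.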
Abstract
Crime reporting is a prevalent form of journalism with the power to shape public perceptions and social policies. How does the language of these reports act on readers? We seek to address this question with the SuspectGuilt Corpus of annotated crime stories from English-language newspapers in the U.S. For SuspectGuilt, annotators read short crime articles and provided text-level ratings concerning the guilt of the main suspect as well as span-level annotations indicating which parts of the story they felt most influenced their ratings. SuspectGuilt thus provides a rich picture of how linguistic choices affect subjective guilt judgments. In addition, we use SuspectGuilt to train and assess predictive models, and show that these models benefit from genre pretraining and joint supervision from the text-level ratings and span-level annotations. Such models might be used as tools for…
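The joint supervision mentioned in the abstract can be pictured as a single training objective that combines a text-level term with a span-level term. The sketch below is only an illustration under stated assumptions: the function name `joint_loss`, the choice of squared error for the rating, the choice of binary cross-entropy for highlighted tokens, and the `span_weight` parameter are all hypothetical and are not taken from the paper.

```python
import math


def joint_loss(pred_rating, gold_rating, span_probs, span_labels, span_weight=0.5):
    """Combine a text-level rating loss with a token-level span loss.

    Hypothetical sketch: the loss terms and the weighting scheme are
    assumptions for illustration, not the paper's actual objective.
    """
    if not span_probs or len(span_probs) != len(span_labels):
        raise ValueError("span_probs and span_labels must be non-empty and equal in length")

    # Text-level term: squared error between the predicted and the
    # annotated guilt rating for the whole story.
    rating_loss = (pred_rating - gold_rating) ** 2

    # Span-level term: mean binary cross-entropy over tokens, where label 1
    # means an annotator highlighted the token as influencing the rating.
    eps = 1e-12  # guards against log(0)
    per_token = [
        -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
        for p, y in zip(span_probs, span_labels)
    ]
    span_loss = sum(per_token) / len(per_token)

    # Weighted sum of the two supervision signals.
    return rating_loss + span_weight * span_loss


# Example call: one story with a four-token text and two highlighted tokens.
loss = joint_loss(
    pred_rating=0.7,
    gold_rating=0.9,
    span_probs=[0.1, 0.8, 0.6, 0.2],
    span_labels=[0, 1, 1, 0],
)
print(loss)
```

Setting `span_weight` to zero reduces this to rating-only training, which gives a simple way to compare single-signal and joint supervision in such a setup.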
