From Consensus to Split Decisions: ABC-Stratified Sentiment in Holocaust Oral Histories
Daban Q. Jaff

TL;DR
This study evaluates the stability and agreement of sentiment classifiers on long, complex Holocaust oral histories, introducing an ABC taxonomy to analyze model divergence.
Contribution
It presents an agreement-based stability taxonomy (ABC) and a comprehensive analysis of sentiment classifier disagreement in sensitive historical narratives.
Findings
Inter-model agreement is low to moderate overall.
Disagreements concentrate at the boundary between neutral and polar (positive or negative) labels.
The ABC framework helps localize where sentiment models diverge.
Abstract
Polarity detection becomes substantially more challenging under domain shift, particularly in heterogeneous, long-form narratives with complex discourse structure, such as Holocaust oral histories. This paper presents a corpus-scale diagnostic study of off-the-shelf sentiment classifiers on long-form Holocaust oral histories, using three pretrained transformer-based polarity classifiers on a corpus of 107,305 utterances and 579,013 sentences. After assembling model outputs, we introduce an agreement-based stability taxonomy (ABC) to stratify inter-model output stability. We report pairwise percent agreement, Cohen's kappa, Fleiss' kappa, and row-normalized confusion matrices to localize systematic disagreement. As an auxiliary descriptive signal, a T5-based emotion classifier is applied to stratified samples from each agreement stratum to compare emotion distributions across strata. The…
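The agreement statistics named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the paper's code: the three-way label set, the toy predictions, and all function names are assumptions made for illustration. In particular, this summary does not define the ABC strata, so `assumed_stratum` uses a hypothetical mapping (A: all three models agree, B: two of three agree, C: all three differ) that may not match the paper's definition.

```python
from collections import Counter

# Assumed label set; the summary does not list the classifiers' labels.
LABELS = ("negative", "neutral", "positive")

def percent_agreement(a, b):
    """Fraction of items on which two models give the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_o = percent_agreement(a, b)
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[l] * cb[l] for l in LABELS) / (n * n)
    return (p_o - p_e) / (1 - p_e)

def fleiss_kappa(items):
    """Fleiss' kappa for items rated by the same number of models.

    `items` is a list of tuples, one label per model for each item.
    """
    n_items, n_raters = len(items), len(items[0])
    counts = [Counter(item) for item in items]
    # Share of all ratings that fall in each category.
    p_j = [sum(c[l] for c in counts) / (n_items * n_raters) for l in LABELS]
    # Per-item agreement among rater pairs.
    p_i = [
        (sum(c[l] ** 2 for l in LABELS) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_items
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)

def row_normalized_confusion(a, b):
    """For each label of model a, the distribution of model b's labels."""
    table = {}
    for l in LABELS:
        row = Counter(y for x, y in zip(a, b) if x == l)
        total = sum(row.values())
        table[l] = {m: (row[m] / total if total else 0.0) for m in LABELS}
    return table

def assumed_stratum(item):
    """Hypothetical ABC mapping (an assumption, not the paper's definition)."""
    return {1: "A", 2: "B", 3: "C"}[len(set(item))]

# Toy predictions from three hypothetical models on six items.
m1 = ["positive", "neutral", "negative", "neutral", "positive", "negative"]
m2 = ["positive", "neutral", "negative", "positive", "neutral", "negative"]
m3 = ["positive", "negative", "negative", "neutral", "negative", "negative"]

items = list(zip(m1, m2, m3))
strata = [assumed_stratum(item) for item in items]

print("percent agreement m1/m2:", percent_agreement(m1, m2))
print("Cohen's kappa m1/m2:", cohen_kappa(m1, m2))
print("Fleiss' kappa:", fleiss_kappa(items))
print("strata:", strata)
print("m1 'neutral' row vs m2:", row_normalized_confusion(m1, m2)["neutral"])
```

Reading the row-normalized matrix row by row shows, for each label one model assigns, how the other model distributes its labels over the same items, which is the kind of view that makes systematic disagreement around the neutral class visible.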
