Beyond Majority Voting: Agreement-Based Clustering to Model Annotator Perspectives in Subjective NLP Tasks

Tadesse Destaw Belay; Ibrahim Said Ahmad; Idris Abdulmumin; Abinew Ali Ayele; Alexander Gelbukh; Eusebio Ricárdez-Vázquez; Olga Kolesnikova; Shamsuddeen Hassan Muhammad; Seid Muhie Yimam

arXiv:2605.09955 · cs.CL · May 12, 2026

TL;DR

This paper introduces an agreement-based clustering method to model annotator disagreement in subjective NLP tasks, improving label aggregation and classification performance across diverse datasets and languages.

Contribution

The paper presents an agreement-based clustering technique that captures annotator perspectives and outperforms both traditional majority voting and individual annotator modeling.

Findings

01. Agreement-based clustering improves classification accuracy in subjective NLP tasks.

02. Multi-label and multitask approaches outperform ensemble and majority vote methods.

03. The method is effective across 40 datasets in 18 languages and three NLP tasks.

Abstract

Disagreement in annotation is a common phenomenon in the development of NLP datasets and serves as a valuable source of insight. While majority voting remains the dominant strategy for aggregating labels, recent work has explored modeling individual annotators to preserve their perspectives. However, modeling each annotator is resource-intensive and remains underexplored across various NLP tasks. We propose an agreement-based clustering technique to model the disagreement between the annotators. We conduct comprehensive experiments in 40 datasets in 18 typologically diverse languages, covering three subjective NLP tasks: sentiment analysis, emotion classification, and hate speech detection. We evaluate four aggregation approaches: majority vote, ensemble, multi-label, and multitask. The results demonstrate that agreement-based clustering can leverage the full spectrum of annotator…
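
To make the idea of agreement-based clustering concrete, the following is a minimal sketch, not the authors' implementation. This summary does not state which agreement measure or clustering algorithm the paper uses, so the sketch assumes raw percentage agreement between annotator pairs and average-linkage agglomerative merging. The function names, the `threshold` parameter, and the toy annotations are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def pairwise_agreement(a, b):
    """Fraction of shared items on which two annotators chose the same label."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    return sum(a[item] == b[item] for item in shared) / len(shared)


def cluster_annotators(annotations, threshold=0.6):
    """Greedily merge the two most-agreeing clusters (average linkage)
    until no pair reaches `threshold`. The threshold is an assumed knob."""
    clusters = [[name] for name in sorted(annotations)]

    def linkage(c1, c2):
        scores = [pairwise_agreement(annotations[x], annotations[y])
                  for x in c1 for y in c2]
        return sum(scores) / len(scores)

    while len(clusters) > 1:
        candidates = [((i, j), linkage(clusters[i], clusters[j]))
                      for i, j in combinations(range(len(clusters)), 2)]
        (i, j), best = max(candidates, key=lambda c: c[1])
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def cluster_labels(annotations, clusters):
    """Majority vote inside each cluster: one label set per perspective."""
    result = []
    for members in clusters:
        votes = {}
        for name in members:
            for item, label in annotations[name].items():
                votes.setdefault(item, Counter())[label] += 1
        result.append({item: c.most_common(1)[0][0] for item, c in votes.items()})
    return result


# Toy data (hypothetical): four annotators, four texts, two opposing views.
annotations = {
    "a1": {"t1": "pos", "t2": "neg", "t3": "pos", "t4": "neg"},
    "a2": {"t1": "pos", "t2": "neg", "t3": "pos", "t4": "pos"},
    "a3": {"t1": "neg", "t2": "pos", "t3": "neg", "t4": "pos"},
    "a4": {"t1": "neg", "t2": "pos", "t3": "neg", "t4": "neg"},
}
clusters = cluster_annotators(annotations)
labels = cluster_labels(annotations, clusters)
print(clusters)
print(labels)
```

In this toy data a global majority vote on `t1` is an even split, while the per-cluster labels keep the two perspectives apart. Each cluster's label set could then feed one of the aggregation approaches the abstract names, for example as one output head in a multitask model, though how the paper wires this up is not specified here.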
