Labels have Human Values: Value Calibration of Subjective Tasks
Mohammed Fayiz Parappan, Ricardo Henao

TL;DR
This paper introduces MC-STL, a framework that clusters annotations into human value groups and calibrates NLP models accordingly, improving alignment and performance on subjective tasks involving human values.
Contribution
The paper presents a novel value calibration framework for subjective NLP tasks that leverages clustering of annotations based on human values and learns value-specific embeddings.
Findings
MC-STL outperforms baselines in discrimination and calibration metrics.
It improves handling of disagreement in subjective annotations.
It is effective across multiple datasets and tasks, covering ordinal, binary, and preference learning predictions.
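To make the calibration finding concrete, the following is a minimal sketch of measuring calibration separately for each value cluster. It is not the authors' implementation or evaluation code. It assumes binary labels, a standard equal-width binned expected calibration error (ECE), and cluster ids that are already known. The function names `expected_calibration_error` and `per_cluster_ece` are hypothetical.

```python
import numpy as np


def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE for binary predictions (equal-width bins over [0, 1])."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Map each predicted probability to a bin index; p == 1.0 goes in the last bin.
    bin_ids = np.minimum((probs * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # Gap between mean confidence and observed positive rate in this bin,
            # weighted by the fraction of samples that fall in the bin.
            gap = abs(probs[mask].mean() - labels[mask].mean())
            ece += mask.mean() * gap
    return float(ece)


def per_cluster_ece(probs, labels, clusters, n_bins=10):
    """ECE computed separately for each cluster id (assumed given, not learned here)."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    cluster_arr = np.asarray(clusters)
    result = {}
    for c in sorted(set(clusters)):
        mask = cluster_arr == c
        result[c] = expected_calibration_error(probs[mask], labels[mask], n_bins)
    return result
```

A model can look moderately calibrated on pooled annotations while being well calibrated for one cluster and poorly calibrated for another, which is why a per-cluster breakdown is useful when annotators disagree along value lines.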
Abstract
Building NLP systems for subjective tasks requires one to ensure their alignment to contrasting human values. We propose the MultiCalibrated Subjective Task Learner framework (MC-STL), which clusters annotations into identifiable human value clusters by three approaches (similarity of annotator rationales, expert-value taxonomies or rater's sociocultural descriptors) and calibrates predictions for each value cluster by learning cluster-specific embeddings. We demonstrate MC-STL on several subjective learning settings, including ordinal, binary, and preference learning predictions, and evaluate it on multiple datasets covering toxic chatbot conversations, offensive social media posts, and human preference alignment. The results show that MC-STL consistently outperforms the baselines that ignore the latent value structure of the annotations, delivering gains in discrimination,…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Ethics and Social Impacts of AI · Misinformation and Its Impacts
