Voices in a Crowd: Searching for Clusters of Unique Perspectives

Nikolas Vitsakis; Amit Parekh; Ioannis Konstas

arXiv:2407.14259·cs.CL·July 22, 2024

TL;DR

This paper introduces a framework that identifies and clusters diverse perspectives, capturing minority opinions without relying on annotator metadata. The resulting clusters are validated through quantitative and qualitative analyses.

Contribution

The proposed method trains models without annotator metadata, extracts latent embeddings informed by annotator behaviour, and clusters similar opinions into "voices", including minority perspectives. The resulting clusters are reported to be robust, which the authors take as evidence of generalisation.

Findings

01. Clusters effectively capture minority perspectives.

02. Framework generalizes well across datasets.

03. Clusters are validated with quantitative and qualitative metrics.

Abstract

Language models have been shown to reproduce the biases present in their training data, which by default reflect the majority perspective. Proposed solutions aim to capture minority perspectives by either modelling annotator disagreements or grouping annotators based on shared metadata, both of which face significant challenges. We propose a framework that trains models without encoding annotator metadata, extracts latent embeddings informed by annotator behaviour, and creates clusters of similar opinions, which we refer to as voices. The resulting clusters are validated post-hoc via internal and external quantitative metrics, as well as a qualitative analysis to identify the type of voice that each cluster represents. Our results demonstrate the strong generalisation capability of our framework, indicated by resulting clusters being adequately robust, while also capturing minority…
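The cluster-then-validate step described in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the "embeddings" are synthetic 2-D points rather than latent vectors from a trained model, and k-means with a silhouette score stands in for whichever clustering algorithm and internal metric the authors actually use. The point is only to show how a small minority group can surface as its own cluster and how an internal metric scores the separation.

```python
import math
import random
from collections import Counter

# Synthetic stand-in for behaviour-informed embeddings (assumption):
# 40 "majority" points near (0, 0) and 8 "minority" points near (6, 6).
rng = random.Random(0)
points = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(40)]
points += [(rng.gauss(6, 0.3), rng.gauss(6, 0.3)) for _ in range(8)]


def kmeans(points, k, iters=50):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient, an internal cluster-validity metric."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        own = [
            math.dist(p, q)
            for j, q in enumerate(points)
            if labels[j] == labels[i] and j != i
        ]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(  # mean distance to the nearest other cluster
            sum(math.dist(p, q) for q, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in clusters
            if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


labels = kmeans(points, k=2)
sizes = Counter(labels)  # cluster id -> number of members
score = silhouette(points, labels)
print("cluster sizes:", dict(sizes))
print("silhouette:", round(score, 3))
```

In this toy setting the smaller cluster plays the role of a minority voice. A qualitative pass over the items in each cluster, as the abstract describes, would still be needed to say what kind of voice each cluster represents.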

Taxonomy

Topics: Online and Blended Learning · Educational Tools and Methods · Education and Critical Thinking Development