Rethinking Crowd Sourcing for Semantic Similarity

Shaul Solomon, Adam Cohn, Hernan Rosenblum, Chezi Hershkovitz, and Ivan P. Yamshchikov

arXiv:2109.11969 · cs.CL · September 27, 2021

TL;DR

This paper examines the ambiguities in crowd-sourced semantic similarity labeling, shows that annotators who treat similarity as a binary judgment have the largest influence on the resulting labels, and proposes heuristics for filtering out unreliable annotators in NLP tasks.

Contribution

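The paper identifies the dominant role of binary annotators (those who treat two sentences as either similar or not similar, with no middle ground) in semantic similarity labeling and introduces heuristics for filtering out unreliable annotators, with the aim of improving label quality.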

Findings

01. Annotators who treat similarity as binary have the largest influence on crowd-sourced labels.

02. The proposed heuristics can filter out unreliable annotators.

03. The results invite further discussion of how humans perceive semantic similarity.

Abstract

Estimation of semantic similarity is crucial for a variety of natural language processing (NLP) tasks. In the absence of a general theory of semantic information, many papers rely on human annotators as the source of ground truth for semantic similarity estimation. This paper investigates the ambiguities inherent in crowd-sourced semantic labeling. It shows that annotators that treat semantic similarity as a binary category (two sentences are either similar or not similar and there is no middle ground) play the most important role in the labeling. The paper offers heuristics to filter out unreliable annotators and stimulates further discussions on human perception of semantic similarity.
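
To make the notion of a "binary" annotator concrete, here is a minimal illustrative sketch. It is not the paper's heuristic, which this page does not describe. It assumes ratings on a hypothetical 1 to 5 scale and flags an annotator as binary when at least 90% of their labels sit at the two ends of the scale. The function name, the scale, the threshold, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def binary_annotators(ratings, low=1, high=5, threshold=0.9):
    """Return ids of annotators whose labels are almost all at the scale extremes.

    ratings: iterable of (annotator_id, item_id, score) tuples.
    low, high: assumed end points of the rating scale (hypothetical 1..5).
    threshold: assumed minimum share of extreme labels to count as binary.
    """
    counts = defaultdict(lambda: [0, 0])  # annotator -> [extreme labels, all labels]
    for annotator, _item, score in ratings:
        counts[annotator][1] += 1
        if score in (low, high):
            counts[annotator][0] += 1
    return {
        annotator
        for annotator, (extreme, total) in counts.items()
        if extreme / total >= threshold
    }


# Made-up sample data: a1 only uses the end points, a2 uses the middle of the scale.
ratings = [
    ("a1", "s1", 1), ("a1", "s2", 5), ("a1", "s3", 5),
    ("a2", "s1", 2), ("a2", "s2", 4), ("a2", "s3", 3),
]
print(binary_annotators(ratings))
```

With the sample data, only `a1` is flagged. A real analysis would need the actual rating scale and annotator behavior from the dataset being studied, and the threshold would have to be chosen with that data in mind.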

Taxonomy

Topics: Misinformation and Its Impacts · Sentiment Analysis and Opinion Mining · Opinion Dynamics and Social Influence