Baby Bear: Seeking a Just Right Rating Scale for Scalar Annotations

Xu Han; Felix Yu; Joao Sedoc; Benjamin Van Durme

arXiv:2408.09765·cs.LG·August 20, 2024

Open Access

TL;DR

This paper introduces IBWS, an iterative Best-Worst Scaling method for scalar annotations, and evaluates cost-effective alternatives for large-scale data collection, demonstrating their utility in dialogue and sentiment ranking tasks.

Contribution

The paper presents IBWS, a robust iterative BWS method, and assesses direct assessment approaches for scalable, accurate scalar annotation collection.

Findings

01 IBWS produces reliable rankings with fewer annotations.

02 Certain direct assessment methods closely correlate with BWS results.

03 The annotated data improves learning-to-rank models in dialogue and sentiment analysis.

Abstract

Our goal is a mechanism for efficiently assigning scalar ratings to each of a large set of elements, for example, answering "what percent positive or negative is this product review?" When sample sizes are small, prior work has advocated methods such as Best-Worst Scaling (BWS) as more robust than direct ordinal annotation ("Likert scales"). Here we first introduce IBWS, which iteratively collects annotations through Best-Worst Scaling, resulting in robustly ranked crowd-sourced data. While effective, IBWS is too expensive for large-scale tasks. Using the results of IBWS as a best-desired outcome, we evaluate various direct assessment methods to determine which is both cost-efficient and correlates best with a large-scale BWS annotation strategy. Finally, we illustrate in the domains of dialogue and sentiment how these annotations can support robust learning-to-rank models.
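
As a rough illustration of how Best-Worst Scaling judgments can be turned into scalar ratings, the sketch below uses the common counting procedure: each item's score is the fraction of times it was picked as best minus the fraction of times it was picked as worst. This is not the paper's IBWS procedure, whose iteration scheme is not described on this page. The function name `bws_scores`, the tuple layout, and the toy items are assumptions made for the example.

```python
from collections import defaultdict


def bws_scores(annotations):
    """Counting-based BWS scores (illustrative, not the paper's IBWS).

    annotations: iterable of (items, best, worst) tuples, where `items`
    is the tuple shown to the annotator and `best` / `worst` are the
    items they selected from it.
    Returns a dict mapping each item to a score in [-1, 1].
    """
    appearances = defaultdict(int)
    best_counts = defaultdict(int)
    worst_counts = defaultdict(int)

    for items, best, worst in annotations:
        for item in items:
            appearances[item] += 1
        best_counts[best] += 1
        worst_counts[worst] += 1

    # score = (#times best - #times worst) / #times shown
    return {
        item: (best_counts[item] - worst_counts[item]) / shown
        for item, shown in appearances.items()
    }


# Hypothetical toy data: three 4-tuples over five items.
annotations = [
    (("a", "b", "c", "d"), "a", "d"),
    (("a", "b", "c", "e"), "a", "c"),
    (("b", "c", "d", "e"), "b", "d"),
]

scores = bws_scores(annotations)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Sorting by score yields a ranking, and the scores themselves can serve as scalar labels. With so few judgments per item the scores are coarse, which is why BWS setups typically show each item in several different tuples.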

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Sparse Evolutionary Training