# Truth Inference at Scale: A Bayesian Model for Adjudicating Highly Redundant Crowd Annotations

**Authors:** Yuan Li, Benjamin I. P. Rubinstein, Trevor Cohn

arXiv: 1902.08918 · 2019-02-26

## TL;DR

This paper introduces a Bayesian model for truth inference over highly redundant crowd-sourced annotations. It is competitive with state-of-the-art methods and is the only method that significantly outperforms the simple majority vote baseline in high-redundancy settings.

## Contribution

A novel Bayesian graphical model with conjugate priors and simple iterative expectation-maximisation inference, designed for highly redundant crowd-sourced data. It performs competitively with state-of-the-art methods and is the only method that significantly outperforms majority vote.
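
To make "conjugate priors and simple iterative inference" concrete, here is a minimal sketch of one such scheme. It is not the authors' model, which is specified only in the full paper. It assumes binary labels, a single accuracy per worker with a Beta prior, and a uniform prior over the true label; the names `infer_truth`, `prior_correct`, and `prior_wrong` are hypothetical.

```python
import math
from collections import defaultdict


def _sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def infer_truth(annotations, prior_correct=2.0, prior_wrong=1.0, n_iter=20):
    """Illustrative EM-style truth inference (assumed model, not the paper's).

    annotations: list of (instance, worker, label) with label in {0, 1}.
    Returns (p_one, acc): P(truth = 1) per instance, accuracy per worker.
    """
    by_instance = defaultdict(list)
    for inst, worker, label in annotations:
        by_instance[inst].append((worker, label))
    workers = {w for _, w, _ in annotations}

    # Start every worker at the Beta prior mean.
    acc = {w: prior_correct / (prior_correct + prior_wrong) for w in workers}
    p_one = {}
    for _ in range(n_iter):
        # E-step: posterior over each true label given current accuracies.
        for inst, votes in by_instance.items():
            log_odds = 0.0
            for w, label in votes:
                weight = math.log(acc[w] / (1.0 - acc[w]))
                log_odds += weight if label == 1 else -weight
            p_one[inst] = _sigmoid(log_odds)

        # M-step: the Beta prior is conjugate to the correct/incorrect counts,
        # so the update is a posterior mean computed from expected counts.
        correct = defaultdict(float)
        total = defaultdict(float)
        for inst, votes in by_instance.items():
            for w, label in votes:
                correct[w] += p_one[inst] if label == 1 else 1.0 - p_one[inst]
                total[w] += 1.0
        for w in workers:
            acc[w] = (prior_correct + correct[w]) / (
                prior_correct + prior_wrong + total[w]
            )
    return p_one, acc
```

Because the prior keeps every accuracy strictly between 0 and 1, the log-odds weights stay finite, and a worker who consistently disagrees with the consensus ends up with a negative weight rather than being ignored.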

## Key findings

- The proposed model is the only method that significantly outperforms majority vote (one-sided significance level 0.025).
- It is simple and fast: it is implemented in only 50 lines of code and trains in seconds.
- The approach is competitive with existing advanced truth inference techniques.

## Abstract

Crowd-sourcing is a cheap and popular means of creating training and evaluation datasets for machine learning; however, it poses the problem of "truth inference", as individual workers cannot be wholly trusted to provide reliable annotations. Research into models of annotation aggregation attempts to infer a latent "true" annotation, which has been shown to improve the utility of crowd-sourced data. However, existing techniques beat simple baselines only in low redundancy settings, where the number of annotations per instance is low ($\le 3$), or in situations where workers are unreliable and produce low quality annotations (e.g., through spamming, random, or adversarial behaviours). As we show, datasets produced by crowd-sourcing are often not of this type: the data is highly redundantly annotated ($\ge 5$ annotations per instance), and the vast majority of workers produce high quality outputs. In these settings, the majority vote heuristic performs very well, and most truth inference models underperform this simple baseline. We propose a novel technique, based on a Bayesian graphical model with conjugate priors, and simple iterative expectation-maximisation inference. Our technique achieves performance competitive with the state-of-the-art benchmark methods, and is the only method that significantly outperforms the majority vote heuristic at one-sided level 0.025, as shown by significance tests. Moreover, our technique is simple, is implemented in only 50 lines of code, and trains in seconds.
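
For reference, the majority vote baseline mentioned in the abstract can be sketched in a few lines. The input layout of `(instance, worker, label)` triples, the function name `majority_vote`, and the tie-breaking rule (smallest label wins) are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter, defaultdict


def majority_vote(annotations):
    """Return the most frequent label per instance.

    annotations: iterable of (instance, worker, label) triples.
    Ties are broken by choosing the smallest label (an assumed convention).
    """
    votes = defaultdict(Counter)
    for instance, _worker, label in annotations:
        votes[instance][label] += 1
    return {
        instance: min(counts, key=lambda lab: (-counts[lab], lab))
        for instance, counts in votes.items()
    }
```

Majority vote treats every worker as equally reliable, which is why it does well when most workers are accurate and each instance receives many annotations.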

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.08918/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1902.08918/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/1902.08918/full.md

---
Source: https://tomesphere.com/paper/1902.08918