False Discovery Rate Control and Statistical Quality Assessment of Annotators in Crowdsourced Ranking

Qianqian Xu; Jiechao Xiong; Xiaochun Cao; Yuan Yao

arXiv:1605.05860 · stat.ML · June 17, 2016 · ICML · 5 cites

Open Access

TL;DR

This paper introduces a statistical framework that detects position bias among crowdsourced annotators while controlling the false discovery rate (FDR), without prior knowledge of how many annotators are biased. Experiments on simulated and real data support the approach.

Contribution

The paper develops a statistical method that uses knockoff filters adapted to the crowdsourcing setting, together with algorithms based on Inverse Scale Space dynamics, to identify biased annotators while controlling the false discovery rate.

Findings

01. Effective detection of position bias in simulated data

02. Successful application to real-world crowdsourcing datasets

03. Framework ensures high-quality, reliable labels in large-scale annotation tasks

Abstract

With the rapid growth of crowdsourcing platforms, it has become easy and relatively inexpensive to collect a dataset labeled by multiple annotators in a short time. However, due to the lack of control over the quality of the annotators, some abnormal annotators may be affected by position bias, which can degrade the quality of the final consensus labels. In this paper we introduce a statistical framework to model and detect annotators' position bias while controlling the false discovery rate (FDR) without prior knowledge of the number of biased annotators. That is, the expected fraction of false discoveries among all discoveries is kept from being too high, so that most of the discoveries are indeed true and replicable. The key technical development relies on some new knockoff filters adapted to our problem and new algorithms based on the Inverse Scale Space dynamics whose…
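The FDR notion in the abstract can be made concrete with a small sketch. It computes the false discovery proportion of a selection (the quantity whose expectation is the FDR) and applies the standard knockoff and knockoff+ thresholds of Barber and Candès. This is an illustration under assumptions, not the adapted filter the paper develops: the per-annotator statistics `w`, the target level `q`, and all function names are hypothetical, and a large positive `w[j]` is assumed to be evidence that annotator `j` is biased.

```python
from typing import List, Sequence


def false_discovery_proportion(selected: Sequence[int],
                               truly_biased: Sequence[int]) -> float:
    """Fraction of selected annotators that are not truly biased.

    Returns 0.0 when nothing is selected. The FDR is the expectation
    of this quantity over repeated experiments.
    """
    selected_set = set(selected)
    if not selected_set:
        return 0.0
    false_hits = selected_set - set(truly_biased)
    return len(false_hits) / len(selected_set)


def knockoff_threshold(w: Sequence[float], q: float, plus: bool = True) -> float:
    """Standard knockoff (plus=False) or knockoff+ (plus=True) threshold.

    Picks the smallest t among the nonzero |w_j| such that
    (offset + #{j: w_j <= -t}) / max(1, #{j: w_j >= t}) <= q,
    with offset 1 for knockoff+ and 0 for knockoff.
    Returns infinity if no such t exists (nothing is selected).
    """
    offset = 1 if plus else 0
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)
        positives = sum(1 for x in w if x >= t)
        if (offset + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def select_biased(w: Sequence[float], q: float, plus: bool = True) -> List[int]:
    """Indices of annotators whose statistic reaches the threshold."""
    t = knockoff_threshold(w, q, plus)
    return [j for j, x in enumerate(w) if x >= t]


if __name__ == "__main__":
    # Hypothetical statistics for eight annotators (illustrative values only).
    w = [5.0, 4.0, 3.0, -1.0, 2.5, 0.5, -0.2, 3.5]
    chosen = select_biased(w, q=0.25)
    print("selected annotators:", chosen)
    # Hypothetical ground truth, used only to illustrate the proportion.
    print("FDP:", false_discovery_proportion(chosen, truly_biased=[0, 1, 2, 7]))
```

The knockoff+ variant adds one to the numerator, which makes the rule more conservative than the plain knockoff threshold. On the toy input it therefore selects fewer annotators at the same `q`.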

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Anomaly Detection Techniques and Applications · Data Stream Mining Techniques