# Mitigating Observation Biases in Crowdsourced Label Aggregation

**Authors:** Ryosuke Ueda, Koh Takeuchi, Hisashi Kashima

arXiv: 2302.13100 · 2023-02-28

## TL;DR

This paper combines statistical label aggregation with a bias removal method from causal inference to mitigate observation biases in crowdsourced labels, and evaluates robustness to spam and colluding workers.

## Contribution

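It proposes statistical aggregation methods for crowdsourced responses that incorporate an observational-data bias removal method from causal inference. The target is observation bias: variation in how often workers respond and in task complexity that correlates with response quality and can therefore skew the aggregated labels.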
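To make the idea of pairing bias removal with aggregation concrete, here is a minimal sketch, not the authors' method. It assumes inverse propensity weighting as the bias removal step (this summary does not name the specific technique) and a plain weighted majority vote as the aggregator. The function name `weighted_majority_vote`, the `propensity` table, and all numeric values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def weighted_majority_vote(responses, propensity):
    """Aggregate labels per task, weighting each response by 1 / propensity.

    responses:  list of (worker, task, label) tuples.
    propensity: dict mapping (worker, task) to the assumed probability,
                in (0, 1], that this worker's response was observed.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for worker, task, label in responses:
        # Rarely observed responses get a larger weight (assumed IPW scheme).
        scores[task][label] += 1.0 / propensity[(worker, task)]
    # Pick the label with the largest total weight for each task.
    return {task: max(votes, key=votes.get) for task, votes in scores.items()}


# Hypothetical data: two always-present workers answer 0,
# one rarely observed worker answers 1.
responses = [("w1", "t1", 0), ("w2", "t1", 0), ("w3", "t1", 1)]
propensity = {("w1", "t1"): 1.0, ("w2", "t1"): 1.0, ("w3", "t1"): 0.25}
uniform = {key: 1.0 for key in propensity}  # reduces to plain majority voting

print(weighted_majority_vote(responses, uniform))     # unweighted baseline
print(weighted_majority_vote(responses, propensity))  # propensity-weighted
```

With uniform propensities the vote reduces to ordinary majority voting. With the hypothetical propensities above, the single rarely observed response outweighs the two frequent ones. In practice the propensities would have to be estimated from the observation pattern; how the paper does this is not described in this summary.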

## Key findings

- Improved aggregation accuracy under strong observation biases
- Enhanced robustness against spam and colluding workers
- Validated effectiveness on synthetic and real datasets

## Abstract

Crowdsourcing has been widely used to efficiently obtain labeled datasets for supervised learning from large numbers of human resources at low cost. However, one of the technical challenges in obtaining high-quality results from crowdsourcing is dealing with the variability and bias caused by the fact that it is humans who execute the work, and various studies have addressed this issue to improve the quality by integrating redundantly collected responses. In this study, we focus on the observation bias in crowdsourcing. Variations in the frequency of worker responses and the complexity of tasks occur, which may affect the aggregation results when they are correlated with the quality of the responses. We also propose statistical aggregation methods for crowdsourcing responses that are combined with an observational data bias removal method used in causal inference. Through experiments using both synthetic and real datasets with/without artificially injected spam and colluding workers, we verify that the proposed method improves the aggregation accuracy in the presence of strong observation biases and robustness to both spam and colluding workers.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.13100/full.md

## Figures

20 figures with captions in the complete paper: https://tomesphere.com/paper/2302.13100/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/2302.13100/full.md

---
Source: https://tomesphere.com/paper/2302.13100