Challenges and strategies for running controlled crowdsourcing experiments
Jorge Ramírez, Marcos Baez, Fabio Casati, Luca Cernuzzi, Boualem Benatallah

TL;DR
This paper discusses the challenges of conducting controlled experiments on crowdsourcing platforms, quantifies the impact of the resulting biases, and introduces CrowdHub, a system for improving experimental control and data reliability.
Contribution
It identifies key challenges in controlled crowdsourcing experiments, quantifies their impact, and presents CrowdHub, a system to mitigate biases and improve experiment control.
Findings
Uncontrolled experiments can cause up to 38% loss in data utility.
Biases significantly alter experimental outcomes.
CrowdHub helps control experimental conditions on crowdsourcing platforms.
Abstract
This paper reports on the challenges and lessons we learned while running controlled experiments in crowdsourcing platforms. Crowdsourcing is becoming an attractive technique to engage a diverse and large pool of subjects in experimental research, allowing researchers to achieve levels of scale and completion times that would otherwise not be feasible in lab settings. However, the scale and flexibility come at the cost of multiple and sometimes unknown sources of bias and confounding factors that arise from technical limitations of crowdsourcing platforms and from the challenges of running controlled experiments in the "wild". In this paper, we take our experience in running systematic evaluations of task design as a motivating example to explore, describe, and quantify the potential impact of running uncontrolled crowdsourcing experiments and derive possible coping strategies. Among…
