Inferring the ground truth through crowdsourcing
Jean Pierre Char

TL;DR
This paper discusses methods for inferring reliable ground truth data from crowdsourced annotations and autonomous agents, especially when true labels are difficult or costly to obtain, emphasizing verification and aggregation techniques.
Contribution
It introduces approaches for inferring and verifying ground truth from crowdsourcing and autonomous agents, addressing challenges in sensitive and complex annotation tasks.
Findings
Effective aggregation improves label accuracy
Verification processes enhance data reliability
Applicable to sensitive domains like medical imaging
Abstract
A universally valid ground truth is almost impossible to obtain, or would come at a very high cost. For supervised learning without a universally valid ground truth, a recommended approach is crowdsourcing: gathering a large data set annotated by multiple individuals of possibly varying expertise levels, and inferring from those annotations the ground truth to be used as labels for training the classifier. Nevertheless, due to the sensitivity of the problem at hand (e.g., mitosis detection in breast cancer histology images), the obtained data needs verification and proper assessment before being used for classifier training. Even in the context of organic computing systems, an indisputable ground truth might not always exist. Therefore, it should be inferred through the aggregation and verification of the local knowledge of each autonomous agent.
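To make the aggregation-then-verification idea concrete, the following is a minimal sketch using plain majority voting with an agreement threshold. It is not the method proposed in the paper, which the abstract does not specify; the function name `aggregate_labels`, the `min_agreement` parameter, and the example labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def aggregate_labels(annotations, min_agreement=0.6):
    """Infer one label per item by majority vote (illustrative assumption).

    annotations: dict mapping an item id to the list of labels given by
    different annotators. Items whose winning label has an agreement
    ratio below min_agreement are flagged for manual verification.
    """
    results = {}
    for item_id, labels in annotations.items():
        if not labels:
            continue  # nothing to aggregate for this item
        label, votes = Counter(labels).most_common(1)[0]
        agreement = votes / len(labels)
        results[item_id] = {
            "label": label,
            "agreement": agreement,
            "needs_review": agreement < min_agreement,
        }
    return results


# Hypothetical annotations for three histology image patches.
example = {
    "img1": ["mitosis", "mitosis", "non-mitosis"],
    "img2": ["mitosis", "non-mitosis", "non-mitosis", "non-mitosis"],
    "img3": ["mitosis", "non-mitosis"],
}
inferred = aggregate_labels(example)
```

Majority voting treats every annotator as equally reliable. With annotators of varying expertise, a weighted scheme that estimates per-annotator reliability would likely be more appropriate, and the `needs_review` flag here stands in for whatever verification step a sensitive application would require.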
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Data Stream Mining Techniques · Privacy-Preserving Technologies in Data
