What is the ground truth? Reliability of multi-annotator data for audio tagging
Irene Martin-Morato, Annamaria Mesaros

TL;DR
This paper investigates the reliability of crowdsourced audio tagging data by adapting two statistical tools, Krippendorff's alpha and multi-annotator competence estimation (MACE), to multi-label annotations and by using MACE to estimate a ground truth from non-expert annotations.
Contribution
It introduces an adaptation of MACE for multi-labeled audio data and demonstrates its effectiveness in estimating ground truth from diverse non-expert annotators.
Findings
MACE effectively estimates annotator competence in multi-label audio tagging.
Krippendorff's alpha quantifies inter-annotator agreement, and therefore how reliable the annotations are.
MACE can be used to estimate a candidate ground truth from crowdsourced annotations.
Abstract
Crowdsourcing has become a common approach for annotating large amounts of data. It has the advantage of harnessing a large workforce to produce large amounts of data in a short time, but comes with the disadvantage of employing non-expert annotators with different backgrounds. This raises the problem of data reliability, in addition to the general question of how to combine the opinions of multiple annotators in order to estimate the ground truth. This paper presents a study of the reliability of annotations and annotators for audio tagging. We adapt the use of Krippendorff's alpha and multi-annotator competence estimation (MACE) for a multi-labeled data scenario, and present how MACE can be used to estimate a candidate ground truth based on annotations from non-expert users with different levels of expertise and competence.
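To make the agreement measure concrete, the following is a minimal sketch of Krippendorff's alpha for nominal data, applied to multi-label tags by treating each tag as a separate present/absent decision per clip. That per-tag binarization is an assumption made here for illustration and is not necessarily the adaptation used in the paper. The function names (`krippendorff_alpha_nominal`, `per_label_alpha`) and the toy data are hypothetical.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    units: list of lists; each inner list holds the values assigned to one
    item by the annotators who rated it (missing ratings are simply absent).
    """
    # Coincidence matrix: every ordered pair of values within an item,
    # weighted by 1 / (number of ratings for that item - 1).
    coincidences = Counter()
    for values in units:
        m = len(values)
        if m < 2:
            continue  # an item rated only once cannot be paired
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per category and overall number of pairable values.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        return float("nan")  # no pairable data

    # Observed and expected disagreement (both scaled by n, which cancels).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    cross = sum(totals[a] * totals[b] for a in totals for b in totals if a != b)
    expected = cross / (n - 1)
    if expected == 0:
        return float("nan")  # only one category was ever used
    return 1.0 - observed / expected


def per_label_alpha(clips, labels):
    """Assumed multi-label handling: one binary alpha per tag."""
    result = {}
    for label in labels:
        units = [[int(label in tags) for tags in clip] for clip in clips]
        result[label] = krippendorff_alpha_nominal(units)
    return result


# Toy data invented for illustration: each clip lists one tag set per annotator.
clips = [
    [{"dog"}, {"dog"}, {"dog", "car"}],
    [{"car"}, {"car"}, {"car"}],
    [{"dog"}, set(), {"dog"}],
    [set(), {"car"}, set()],
]
alphas = per_label_alpha(clips, ["dog", "car"])
```

An alpha of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement. Computing one value per tag shows which sound classes annotators tend to disagree on, at the cost of ignoring dependencies between tags.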
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Reliability and Agreement in Measurement · Survey Methodology and Nonresponse
