What is the ground truth? Reliability of multi-annotator data for audio tagging

Irene Martin-Morato; Annamaria Mesaros

arXiv:2104.04214·eess.AS·April 12, 2021·EUSIPCO

TL;DR

This paper investigates the reliability of crowdsourced audio tagging data by adapting two statistical tools, Krippendorff's alpha and MACE (multi-annotator competence estimation), to multi-label annotations, and uses MACE to estimate a candidate ground truth from non-expert annotations.

Contribution

It introduces an adaptation of MACE for multi-labeled audio data and demonstrates its effectiveness in estimating ground truth from diverse non-expert annotators.

Findings

01

MACE effectively estimates annotator competence in multi-label audio tagging.

02

Krippendorff's alpha, a measure of inter-annotator agreement, provides insight into annotation reliability.

03

The approach improves ground truth estimation from crowdsourced data.

Abstract

Crowdsourcing has become a common approach for annotating large amounts of data. It has the advantage of harnessing a large workforce to produce large amounts of data in a short time, but comes with the disadvantage of employing non-expert annotators with different backgrounds. This raises the problem of data reliability, in addition to the general question of how to combine the opinions of multiple annotators in order to estimate the ground truth. This paper presents a study of the annotations and annotators' reliability for audio tagging. We adapt the use of Krippendorff's alpha and multi-annotator competence estimation (MACE) for a multi-labeled data scenario, and present how MACE can be used to estimate a candidate ground truth based on annotations from non-expert users with different levels of expertise and competence.
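
To make the agreement measure concrete, the following is a minimal sketch of Krippendorff's alpha for nominal data, applied to multi-label tags. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the measure is adapted to multi-labeled data, so the sketch assumes each (clip, tag) pair is treated as one binary unit. The names `binarize_tags` and `krippendorff_alpha_nominal` and the input layout are hypothetical.

```python
from collections import Counter
from itertools import permutations


def binarize_tags(annotations, vocabulary):
    """Turn multi-label annotations into binary rating units.

    ASSUMPTION (not from the paper): each (clip, tag) pair becomes one unit,
    rated 1 by annotators who chose the tag and 0 by those who did not.
    `annotations` is a list of clips; each clip is a list of tag sets,
    one set per annotator who labeled that clip.
    """
    units = []
    for clip in annotations:
        for tag in vocabulary:
            units.append([int(tag in tags) for tags in clip])
    return units


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list of lists; each inner list holds the values assigned to
    one unit by the annotators who rated it (missing ratings are omitted).
    """
    # Coincidence matrix: every ordered pair of values within a unit,
    # weighted by 1 / (number of ratings in the unit - 1).
    coincidences = Counter()
    for values in units:
        m = len(values)
        if m < 2:
            continue  # a single rating carries no agreement information
        for c, k in permutations(values, 2):
            coincidences[(c, k)] += 1.0 / (m - 1)

    # Marginal totals per category and overall number of pairable values.
    totals = Counter()
    for (c, _k), weight in coincidences.items():
        totals[c] += weight
    n = sum(totals.values())

    # Observed and expected disagreement (both scaled by n, which cancels).
    observed = sum(w for (c, k), w in coincidences.items() if c != k)
    expected = sum(
        totals[c] * totals[k] for c in totals for k in totals if c != k
    ) / (n - 1)
    if expected == 0:
        return float("nan")  # undefined when only one category is ever used
    return 1.0 - observed / expected
```

Calling `krippendorff_alpha_nominal(binarize_tags(annotations, vocabulary))` gives one pooled agreement score; a value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance. Pooling all tags into one score is a design choice of this sketch, and computing alpha separately per tag is an equally plausible alternative.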

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Reliability and Agreement in Measurement · Survey Methodology and Nonresponse