A Data-Centric Framework for Machine Listening Projects: Addressing Large-Scale Data Acquisition and Labeling through Active Learning

Javier Naranjo-Alcazar; Jordi Grau-Haro; Ruben Ribes-Serrano; Pedro Zuccarello

arXiv:2405.18153 · cs.SD · October 10, 2024

Open Access

TL;DR

This paper introduces a data-centric framework utilizing active learning for efficient large-scale audio data acquisition and labeling in machine listening, demonstrating successful application in an industrial port setting.

Contribution

It presents a comprehensive framework for resource-efficient data collection and labeling in machine listening, emphasizing active learning over crowdsourcing and detailing system configuration and optimization strategies.

Findings

01. Labeled 6540 audio samples over five months with a small team.

02. Demonstrated the framework's effectiveness and adaptability in resource-constrained environments.

03. Optimized the labeling budget through active learning in large-scale datasets.

Abstract

Machine Listening focuses on developing technologies to extract relevant information from audio signals. A critical aspect of these projects is the acquisition and labeling of contextualized data, which is inherently complex and requires specific resources and strategies. Despite the availability of some audio datasets, many are unsuitable for commercial applications. The paper emphasizes the importance of Active Learning (AL) using expert labelers over crowdsourcing, which often lacks detailed insights into dataset structures. AL is an iterative process combining human labelers and AI models to optimize the labeling budget by intelligently selecting samples for human review. This approach addresses the challenge of handling large, constantly growing datasets that exceed available computational resources and memory. The paper presents a comprehensive data-centric framework for Machine…
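
The selection step that the abstract describes, where a model picks which samples go to human labelers under a fixed budget, can be illustrated with a small sketch. The abstract does not say which query strategy the paper uses, so this example assumes entropy-based uncertainty sampling, a common active learning heuristic. The function names, clip ids, and probabilities are hypothetical. Predictions are consumed as a stream, so the unlabeled pool never has to fit in memory at once.

```python
import heapq
import math
from typing import Iterable, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_review(
    predictions: Iterable[tuple[str, Sequence[float]]],
    budget: int,
) -> list[str]:
    """Return the ids of the `budget` most uncertain clips, most uncertain first.

    `predictions` may be a generator: heapq.nlargest keeps only `budget`
    candidates in memory while it scans the stream.
    """
    top = heapq.nlargest(budget, predictions, key=lambda item: entropy(item[1]))
    return [clip_id for clip_id, _ in top]


# Hypothetical model outputs: (clip id, class probabilities).
pool = [
    ("clip_a", [0.98, 0.01, 0.01]),  # confident prediction
    ("clip_b", [0.34, 0.33, 0.33]),  # nearly uniform, most uncertain
    ("clip_c", [0.70, 0.20, 0.10]),
    ("clip_d", [0.50, 0.49, 0.01]),
]

to_label = select_for_review(iter(pool), budget=2)
print(to_label)  # ['clip_b', 'clip_c']
```

In a full loop, the selected clips would be labeled by experts, added to the training set, and the model retrained before the next selection round.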

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies