General-purpose Tagging of Freesound Audio with AudioSet Labels: Task Description, Dataset, and Baseline

Eduardo Fonseca; Manoj Plakal; Frederic Font; Daniel P. W. Ellis; Xavier Favory; Jordi Pons; Xavier Serra

arXiv:1807.09902 · cs.SD · October 9, 2018 · 100 cites

TL;DR

This paper introduces a new audio tagging task using Freesound clips labeled with AudioSet categories, providing a dataset, task description, and baseline system to advance general-purpose audio recognition.

Contribution

It presents the task setup, the dataset prepared for the competition (41 categories drawn from the AudioSet Ontology), and a baseline system for tagging Freesound audio with AudioSet labels.

Findings

01. Dataset and task description for Freesound audio tagging
02. Baseline system performance established
03. Encourages development of improved audio tagging models

Abstract

This paper describes Task 2 of the DCASE 2018 Challenge, titled "General-purpose audio tagging of Freesound content with AudioSet labels". This task was hosted on the Kaggle platform as "Freesound General-Purpose Audio Tagging Challenge". The goal of the task is to build an audio tagging system that can recognize the category of an audio clip from a subset of 41 diverse categories drawn from the AudioSet Ontology. We present the task, the dataset prepared for the competition, and a baseline system.

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing