An Open-set Recognition and Few-Shot Learning Dataset for Audio Event Classification in Domestic Environments

Javier Naranjo-Alcazar; Sergi Perez-Castanos; Pedro Zuccarello; Ana M. Torres; Jose J. Lopez; Francesc J. Ferri; Maximo Cobos

arXiv:2002.11561 · cs.SD · April 12, 2022 · 1 citation

TL;DR

This paper introduces a new annotated dataset for open-set recognition and few-shot learning in domestic audio event classification, addressing the lack of dedicated resources and demonstrating baseline results with transfer learning.

Contribution

It provides a carefully annotated dataset for audio FSL in OSR scenarios and benchmarks baseline performance using transfer learning methods.

Findings

1. The dataset contains 1360 clips from 34 classes, including pattern and unwanted sounds.

2. Baseline transfer learning models achieve promising results on the dataset.

3. The dataset facilitates research in audio FSL and OSR in domestic environments.

Abstract

The problem of training with a small set of positive samples is known as few-shot learning (FSL). Traditional deep learning (DL) algorithms are well known to perform very well when trained with large datasets. However, in many applications it is not possible to obtain such a large number of samples. In the image domain, typical FSL applications include those related to face recognition. In the audio domain, applications such as music fraud detection or speaker recognition can clearly benefit from FSL methods. This paper deals with the application of FSL to the detection of specific and intentional acoustic events given by different types of sound alarms, such as doorbells or fire alarms, using a limited number of samples. These sounds typically occur in domestic environments, where many events corresponding to a wide variety of sound classes take place. Therefore, the detection of such…
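The open-set setting described in the abstract, where a few target alarm sounds must be recognized while everything else is rejected, can be illustrated with a minimal sketch. It is not taken from the paper or its baseline repository. It assumes a classifier that outputs one raw score per known class and applies one common rejection rule: the input is labeled "unknown" when the highest softmax probability falls below a threshold. The function names, the label names, and the threshold values are all hypothetical.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def open_set_predict(scores, known_labels, threshold=0.5):
    """Return the most likely known label, or 'unknown' when confidence is low.

    The threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    if len(scores) != len(known_labels):
        raise ValueError("need exactly one score per known label")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown"
    return known_labels[best]

# Hypothetical pattern-sound labels and scores, for illustration only.
labels = ["doorbell", "fire_alarm"]
confident = open_set_predict([4.0, 0.5], labels)
ambiguous = open_set_predict([1.0, 1.0], labels, threshold=0.6)
print(confident, ambiguous)
```

In this sketch a clear winner among the scores is accepted as a known class, while evenly split scores are rejected as unknown. In practice the threshold would have to be tuned on held-out data containing both pattern and unwanted sounds.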

Code & Models

Repositories

machine-listeners-valencia/fsl_osr_dataset_baseline (tf, official)

Taxonomy

Topics: Music and Audio Processing · Water Systems and Optimization · Anomaly Detection Techniques and Applications