Robust Feature Learning on Long-Duration Sounds for Acoustic Scene Classification

Yuzhong Wu; Tan Lee

arXiv:2108.05008·cs.SD·August 12, 2021

Open Access

TL;DR

This paper proposes a robust feature learning framework for acoustic scene classification that down-weights long-duration sounds during training, improving generalization across unseen devices and locations.

Contribution

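It introduces a robust feature learning (RFL) framework that uses an auxiliary classifier and an associated loss function to make CNN-based acoustic scene classification (ASC) systems more robust to domain variations such as unseen recording devices and locations.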

Findings

01. Improved accuracy on unseen devices and cities.

02. Enhanced robustness of ASC classifiers.

03. Effective down-weighting of long-duration sounds during training.

Abstract

Acoustic scene classification (ASC) aims to identify the type of scene (environment) in which a given audio signal is recorded. Log-mel features and convolutional neural networks (CNNs) have recently become the most popular time-frequency (TF) feature representation and classifier in ASC, respectively. An audio signal recorded in a scene may include various sounds overlapping in time and frequency. A previous study suggests that considering long-duration sounds and short-duration sounds separately in a CNN may improve ASC accuracy. This study addresses the generalization ability of acoustic scene classifiers. In practice, the characteristics of acoustic scene signals may be affected by various factors, such as the choice of recording device and changes in recording location. When an established ASC system predicts scene classes on audio recorded in unseen scenarios, its accuracy…
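The idea of treating long-duration and short-duration sounds separately can be illustrated with a small sketch. The code below is not taken from the paper: it assumes a median filter along the time axis as a stand-in for the decomposition, and the function name `split_by_duration`, the `kernel` parameter, and the toy input are all hypothetical. It only shows how a log-mel band might be split into a slowly varying part and a residual containing brief events.

```python
from statistics import median


def split_by_duration(log_mel, kernel=5):
    """Split a log-mel spectrogram into a slowly varying part and a residual.

    log_mel is a list of mel bands, each a list of per-frame values.
    Hypothetical helper: the temporal median filter is an assumed stand-in,
    not the decomposition confirmed by the source text.
    """
    half = kernel // 2
    long_part, short_part = [], []
    for row in log_mel:
        n = len(row)
        smoothed = []
        for t in range(n):
            # The window is clipped at the edges instead of being padded.
            lo, hi = max(0, t - half), min(n, t + half + 1)
            smoothed.append(median(row[lo:hi]))
        long_part.append(smoothed)
        # Whatever the median filter rejects is treated as short-duration.
        short_part.append([x - s for x, s in zip(row, smoothed)])
    return long_part, short_part


# Toy input: one mel band with a steady hum (1.0) and a brief click at frame 3.
toy = [[1.0, 1.0, 1.0, 6.0, 1.0, 1.0, 1.0]]
long_part, short_part = split_by_duration(toy, kernel=5)
```

In this toy case the steady hum ends up in `long_part` and the click ends up in `short_part`. How the paper's framework actually identifies and down-weights long-duration sounds during training is not specified in the text shown here.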


Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Diverse Musicological Studies

Methods: Auxiliary Classifier