Trainingless Adaptation of Pretrained Models for Environmental Sound Classification

Noriyuki Tonami; Wataru Kohno; Keisuke Imoto; Yoshiyuki Yajima; Sakiko Mishima; Reishi Kondo; Tomoyuki Hino

arXiv:2412.17212·cs.SD·December 24, 2024

Open Access

TL;DR

This paper introduces a trainingless domain adaptation method for pretrained environmental sound classification models, improving accuracy without additional training or heavy computational resources.

Contribution

It proposes a novel trainingless adaptation approach using time-frequency structure recovery and frequency filtering, reducing reliance on resource-intensive fine-tuning.
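
The two ingredients named above can be illustrated with a rough sketch. This page does not give the paper's actual operations, so everything here is an assumption made for illustration: the intermediate activation is assumed to be a flattened array of shape (channels × frequency bins, time frames), the function names `recover_tf_structure` and `frequency_filter` are hypothetical, and the filter is a plain band-pass mask rather than the authors' filter.

```python
import numpy as np


def recover_tf_structure(flat, n_channels, n_freq):
    """Reshape a flattened (features, time) activation into
    (channels, freq, time). Assumes features = n_channels * n_freq,
    which is an illustrative assumption, not the paper's definition."""
    n_features, n_time = flat.shape
    if n_features != n_channels * n_freq:
        raise ValueError("feature dimension does not factor as channels * freq")
    return flat.reshape(n_channels, n_freq, n_time)


def frequency_filter(feat, low_bin, high_bin):
    """Keep frequency-like bins in [low_bin, high_bin) and zero the rest.
    A simple band-pass mask, used here only as a stand-in filter."""
    mask = np.zeros(feat.shape[1], dtype=feat.dtype)
    mask[low_bin:high_bin] = 1
    # Broadcast the 1-D mask over the channel and time axes.
    return feat * mask[None, :, None]


# Toy activation: 2 channels x 4 frequency-like bins, 3 time frames.
flat = np.arange(2 * 4 * 3, dtype=float).reshape(8, 3)
tf = recover_tf_structure(flat, n_channels=2, n_freq=4)
filtered = frequency_filter(tf, low_bin=1, high_bin=3)
```

Neither step updates any weights, which is the sense in which such an adaptation needs no training; which layers to filter and which bins to keep would have to come from the paper itself.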

Findings

01. Achieved a 20.40% accuracy improvement on the ESC-50 dataset.

02. Demonstrated the effectiveness of trainingless adaptation compared with conventional methods.

Abstract

Deep neural network (DNN)-based models for environmental sound classification are not robust to domains outside their training data, that is, to out-of-distribution or unseen data. To use pretrained models in an unseen domain, adaptation methods such as fine-tuning and transfer learning are applied, which requires rich computing resources, e.g., a graphics processing unit (GPU). However, because state-of-the-art models are becoming increasingly resource-intensive, it is becoming harder for those with limited computing resources to keep up with research trends. In this paper, we propose a trainingless adaptation method for pretrained models for environmental sound classification. To introduce the trainingless adaptation method, we first propose an operation that recovers time-frequency-ish (TF-ish) structures in the intermediate layers of DNN models. We then propose the…

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Underwater Acoustics Research