Deep Generic Representations for Domain-Generalized Anomalous Sound Detection

Phurich Saengthong; Takahiro Shinozaki

arXiv:2409.05035·cs.SD·September 10, 2024

TL;DR

This paper introduces GenRep, a domain-generalized anomalous sound detection method that pairs features from a large-scale pre-trained extractor with kNN, outperforming OE-based methods on DCASE2023T2 without fine-tuning or extensive labeled data.

Contribution

It proposes GenRep, which combines a pre-trained feature extractor, MemMixup augmentation of the target memory bank, and domain normalization for robust, label-free anomaly detection across domains.

Findings

01 Outperforms OE-based methods on DCASE2023T2 with a score of 73.79%

02 Remains robust in limited-data scenarios

03 Requires no fine-tuning for effective domain generalization

Abstract

Developing a reliable anomalous sound detection (ASD) system requires robustness to noise, adaptation to domain shifts, and effective performance with limited training data. Current leading methods rely on extensive labeled data for each target machine type to train feature extractors using Outlier-Exposure (OE) techniques, yet their performance on the target domain remains sub-optimal. In this paper, we present GenRep, which utilizes generic feature representations from a robust, large-scale pre-trained feature extractor combined with kNN for domain-generalized ASD, without the need for fine-tuning. GenRep incorporates MemMixup, a simple approach for augmenting the target memory bank using nearest source samples, paired with a domain normalization technique to address the imbalance between source and target domains. GenRep outperforms the best OE-based…
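
The pipeline described in the abstract (a kNN memory bank over pre-trained features, MemMixup, and domain normalization) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes embeddings have already been extracted by some pre-trained model, and every specific choice here is an assumption made for illustration: the function names, the mixing coefficient `lam`, the use of Euclidean distance, z-scoring each domain's distances against leave-one-out statistics of its own bank, and taking the minimum across domains.

```python
import numpy as np

def knn_distance(queries, memory, k=1, skip_self=False):
    """Mean Euclidean distance from each query to its k nearest memory vectors.

    With skip_self=True the closest match is dropped, which gives
    leave-one-out distances when queries and memory are the same array.
    """
    d = np.linalg.norm(queries[:, None, :] - memory[None, :, :], axis=-1)
    d = np.sort(d, axis=1)
    start = 1 if skip_self else 0
    return d[:, start:start + k].mean(axis=1)

def mem_mixup(target, source, lam=0.9, n_neighbors=1):
    """Augment the target bank by mixing each target vector with its
    nearest source vectors (lam and n_neighbors are assumed values)."""
    d = np.linalg.norm(target[:, None, :] - source[None, :, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, :n_neighbors]                 # (T, n)
    mixed = lam * target[:, None, :] + (1.0 - lam) * source[idx]  # (T, n, D)
    return np.concatenate([target, mixed.reshape(-1, target.shape[1])], axis=0)

def anomaly_scores(queries, source, target, k=1):
    """Z-score the kNN distance per domain, then keep the smaller score.

    The normalization form and the min fusion are assumptions; they stand in
    for the domain normalization step named in the abstract.
    """
    target_aug = mem_mixup(target, source)
    per_domain = []
    for bank in (source, target_aug):
        ref = knn_distance(bank, bank, k, skip_self=True)  # in-domain reference
        raw = knn_distance(queries, bank, k)
        per_domain.append((raw - ref.mean()) / (ref.std() + 1e-8))
    return np.minimum(per_domain[0], per_domain[1])

# Toy data: a large source bank and a small, shifted target bank.
rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(200, 8))
target = rng.normal(3.0, 1.0, size=(10, 8))
normal_query = target[:1] + 0.01         # close to a known target sample
anomalous_query = np.full((1, 8), 20.0)  # far from both domains
scores = anomaly_scores(np.vstack([normal_query, anomalous_query]), source, target)
```

In this toy setup the second query should receive the higher score, since it lies far from both banks. Normalizing per domain before combining is one way to keep the much larger source bank from dominating the decision, which is the imbalance the abstract refers to.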

Code & Models

Repositories

phuriches/genrepasd (PyTorch, official)

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis

Methods: Sparse Evolutionary Training