Deep Generic Representations for Domain-Generalized Anomalous Sound Detection
Phurich Saengthong, Takahiro Shinozaki

TL;DR
This paper introduces GenRep, a domain-generalized anomalous sound detection method that pairs features from a large-scale pre-trained extractor with kNN and outperforms OE-based methods without fine-tuning or extensive labeled data.
Contribution
It proposes GenRep, which combines a pre-trained feature extractor, MemMixup augmentation of the target memory bank, and domain normalization for robust, label-free anomaly detection across domains.
Findings
Outperforms OE-based methods on DCASE2023T2 with a score of 73.79%
Robust under limited data scenarios
No fine-tuning required for effective domain generalization
Abstract
Developing a reliable anomalous sound detection (ASD) system requires robustness to noise, adaptation to domain shifts, and effective performance with limited training data. Current leading methods rely on extensive labeled data for each target machine type to train feature extractors using Outlier-Exposure (OE) techniques, yet their performance on the target domain remains sub-optimal. In this paper, we present GenRep, which utilizes generic feature representations from a robust, large-scale pre-trained feature extractor combined with kNN for domain-generalized ASD, without the need for fine-tuning. GenRep incorporates MemMixup, a simple approach for augmenting the target memory bank using nearest source samples, paired with a domain normalization technique to address the imbalance between source and target domains. GenRep outperforms the best OE-based…
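The pipeline described in the abstract (a kNN memory bank, MemMixup, and domain normalization) can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulas, so several details here are assumptions rather than the authors' implementation: the mixing weight `lam`, the number of neighbors `k`, the z-score form of domain normalization, and taking the minimum of the two normalized scores. The embeddings are assumed to come from a frozen pre-trained extractor, and all function names are hypothetical.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distances between rows of a (n, d) and rows of b (m, d)."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)


def knn_distance(queries, bank, k=1):
    """Mean distance from each query to its k nearest memory-bank entries."""
    k = min(k, bank.shape[0])
    return np.sort(pairwise_dist(queries, bank), axis=1)[:, :k].mean(axis=1)


def mem_mixup(target, source, k=2, lam=0.9):
    """Augment the target bank by mixing each target embedding with its
    k nearest source embeddings (lam and k are assumed values)."""
    k = min(k, source.shape[0])
    idx = np.argsort(pairwise_dist(target, source), axis=1)[:, :k]
    mixed = lam * target[:, None, :] + (1.0 - lam) * source[idx]
    return np.concatenate([target, mixed.reshape(-1, target.shape[1])], axis=0)


def bank_stats(bank, k=1):
    """Mean and std of leave-one-out kNN distances inside one bank."""
    d = pairwise_dist(bank, bank)
    np.fill_diagonal(d, np.inf)  # exclude each sample's distance to itself
    k = min(k, bank.shape[0] - 1)
    nearest = np.sort(d, axis=1)[:, :k].mean(axis=1)
    return nearest.mean(), nearest.std() + 1e-8


def anomaly_score(queries, source_bank, target_bank, k=1):
    """Z-normalize kNN distances per domain (assumed form of domain
    normalization), then keep the smaller score so that a query that
    looks normal in either domain is scored as normal."""
    scores = []
    for bank in (source_bank, target_bank):
        mu, sigma = bank_stats(bank, k)
        scores.append((knn_distance(queries, bank, k) - mu) / sigma)
    return np.minimum(scores[0], scores[1])
```

In use, `target_bank` would be the output of `mem_mixup(target, source)`, so the few target-domain embeddings are densified before scoring. No training step is involved, which matches the abstract's claim that no fine-tuning is needed.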
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis
Methods: Sparse Evolutionary Training
