Anomalous Sound Detection using unsupervised and semi-supervised autoencoders and gammatone audio representation
Sergi Perez-Castanos, Javier Naranjo-Alcazar, Pedro Zuccarello, and Maximo Cobos

TL;DR
This paper introduces a novel unsupervised and semi-supervised autoencoder framework utilizing Gammatone audio representations for effective anomalous sound detection, particularly in industrial settings, outperforming existing baselines.
Contribution
It presents a new framework that combines convolutional autoencoders (unsupervised and semi-supervised) with a Gammatone-based audio representation for anomalous sound detection.
Findings
Significant performance improvement over baseline methods
Effective early detection of machine malfunctions
Applicable to real-world industrial scenarios
Abstract
Anomalous sound detection (ASD) is currently one of the topical subjects in the machine listening discipline. Unsupervised detection is attracting a lot of interest due to its immediate applicability in many fields. In industrial processes, for example, the early detection of malfunctions or damage in machines can yield large savings and improve efficiency. An unsupervised ASD solution suits this problem because it does not need recordings of faulty machines in the training stage, so no machine has to be damaged merely to collect such audio. This paper proposes a novel framework based on convolutional autoencoders (both unsupervised and semi-supervised) and a Gammatone-based representation of the audio. The results obtained by these architectures substantially exceed the baseline results.
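To make the "Gammatone-based representation" more concrete, here is a minimal sketch of a Gammatone filterbank in pure Python. The abstract does not give the filterbank settings, so everything specific here is an assumption chosen for illustration: the function names are hypothetical, the filters are 4th-order, the bandwidths use the Glasberg and Moore ERB formula with the common 1.019 scaling, and the band count, frequency range, and sample rate are arbitrary. It is not the authors' feature extractor.

```python
import math

def erb_bandwidth(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb / 21.4) - 1.0) * 1000.0 / 4.37

def erb_spaced_centres(f_low, f_high, n_bands):
    """Centre frequencies equally spaced on the ERB-rate scale."""
    lo, hi = hz_to_erb_rate(f_low), hz_to_erb_rate(f_high)
    step = (hi - lo) / (n_bands - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n_bands)]

def gammatone_ir(fc_hz, sample_rate, duration_s=0.05, order=4):
    """Peak-normalised impulse response t^(n-1) exp(-2 pi b t) cos(2 pi fc t)."""
    b = 1.019 * erb_bandwidth(fc_hz)  # assumed bandwidth scaling
    ir = []
    for k in range(int(duration_s * sample_rate)):
        t = k / sample_rate
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        ir.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    peak = max(abs(v) for v in ir)
    return [v / peak for v in ir]

def band_energy(signal, ir):
    """Mean squared output of the filter, using only fully overlapped samples."""
    m = len(ir)
    if len(signal) < m:
        raise ValueError("signal must be at least as long as the impulse response")
    total, count = 0.0, 0
    for n in range(m - 1, len(signal)):
        y = sum(ir[k] * signal[n - k] for k in range(m))
        total += y * y
        count += 1
    return total / count

def gammatone_energies(signal, sample_rate, centres):
    """One energy value per band: a single column of a Gammatone-style spectrogram."""
    return [band_energy(signal, gammatone_ir(fc, sample_rate)) for fc in centres]

if __name__ == "__main__":
    fs = 8000  # assumed sample rate for the demo
    centres = erb_spaced_centres(100.0, 3500.0, 8)
    tone = [math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(800)]
    for fc, e in zip(centres, gammatone_energies(tone, fs, centres)):
        print(f"{fc:8.1f} Hz  energy {e:.4g}")
```

Computing these band energies over successive short frames would give a time-frequency image that a convolutional autoencoder could take as input. A practical pipeline would use an optimised filterbank rather than this direct convolution, which is kept only for readability.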
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Anomaly Detection Techniques and Applications
