Time-weighted Frequency Domain Audio Representation with GMM Estimator for Anomalous Sound Detection

Jian Guan; Youde Liu; Qiaoxi Zhu; Tieran Zheng; Jiqing Han; Wenwu Wang

arXiv:2305.03328 · eess.AS · May 8, 2023 · 1 citation

TL;DR

This paper introduces a Time-Weighted Frequency Domain Representation combined with a GMM estimator for anomalous sound detection, offering a simpler yet effective alternative to deep learning methods, especially under domain-shift conditions.

Contribution

The paper proposes TWFR, a novel adaptive statistical frequency representation, and combines it with a GMM estimator to improve anomaly detection across different machine types.

Findings

01 Outperforms recent deep learning methods on the DCASE 2022 dataset

02 Achieved 3rd place in DCASE 2022 Challenge Task 2

03 Demonstrates robustness under domain-shift conditions

Abstract

Although deep learning is the mainstream method in unsupervised anomalous sound detection, a Gaussian Mixture Model (GMM) with a statistical audio frequency representation as input can achieve comparable results with much lower model complexity and fewer parameters. Existing statistical frequency representations, e.g., the log-Mel spectrogram's average or maximum over time, do not always work well for different machines. This paper presents the Time-Weighted Frequency Domain Representation (TWFR) with the GMM method (TWFR-GMM) for anomalous sound detection. The TWFR is a generalized statistical frequency domain representation that can adapt to different machine types, using global weighted ranking pooling over the time domain. This allows the GMM estimator to recognize anomalies, even under domain-shift conditions, as visualized with a Mahalanobis distance-based metric. Experiments on DCASE 2022…
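
The pooling step named in the abstract can be illustrated with a minimal sketch. It assumes the common geometric form of global weighted ranking pooling: for each frequency bin, the values are sorted over time in descending order and weighted by r^0, r^1, r^2, ..., then normalized. The exact weighting used in the paper is not given in this excerpt, so the function name `gwrp_over_time`, the parameter `r`, and the array shapes below are illustrative assumptions rather than the authors' implementation. Under this assumption, r = 0 reduces to the maximum over time and r = 1 to the average over time, which is one way to read "generalized" in the abstract.

```python
import numpy as np


def gwrp_over_time(spec, r):
    """Pool a (n_freq, n_frames) spectrogram over time into a (n_freq,) vector.

    Assumed formulation (not taken from the paper): geometric rank weights
    r**k applied to each frequency bin's values sorted in descending order.
    """
    spec = np.asarray(spec, dtype=float)
    n_frames = spec.shape[1]
    # Sort each frequency bin's values over time, largest first.
    ranked = np.sort(spec, axis=1)[:, ::-1]
    # Rank weights 1, r, r^2, ...; note that 0.0 ** 0 evaluates to 1.0.
    weights = float(r) ** np.arange(n_frames)
    weights = weights / weights.sum()
    return ranked @ weights


# Hypothetical input: a random stand-in for a log-Mel spectrogram.
rng = np.random.default_rng(0)
log_mel = rng.normal(size=(128, 313))

feat_max = gwrp_over_time(log_mel, 0.0)   # behaves like max over time
feat_mean = gwrp_over_time(log_mel, 1.0)  # behaves like mean over time
feat_mid = gwrp_over_time(log_mel, 0.5)   # in between the two
```

One pooled vector per clip could then be passed to a GMM fitted on normal clips, for instance scikit-learn's `GaussianMixture`, using the negative log-likelihood from `score_samples` as an anomaly score. That pairing is a plausible reading of the abstract, not a description of the official repository.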

Code & Models

Repositories

liuyoude/twfr-gmm (PyTorch, official)

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Anomaly Detection Techniques and Applications