Hierarchical Metadata Information Constrained Self-Supervised Learning for Anomalous Sound Detection Under Domain Shift

Haiyan Lan; Qiaoxi Zhu; Jian Guan; Yuming Wei; Wenwu Wang

arXiv:2309.07498 · eess.AS · December 19, 2023

Open Access

TL;DR

This paper introduces a hierarchical metadata constrained self-supervised learning approach for anomalous sound detection under domain shift, leveraging hierarchical relations between section IDs and attributes to improve feature representation and detection accuracy.

Contribution

The paper proposes a hierarchical metadata constraint framework and an attribute-group-center (AGC) scoring method to improve ASD performance under domain shift.

Findings

01 Outperforms state-of-the-art methods in the DCASE 2022 challenge

02 Improves feature representation by using hierarchical metadata constraints

03 Demonstrates robustness under various domain shift conditions

Abstract

Self-supervised learning methods have achieved promising performance for anomalous sound detection (ASD) under domain shift, where the type of domain shift is considered in feature learning by incorporating section IDs. However, the attributes accompanying audio files under each section, such as machine operating conditions and noise types, have not been considered, although they are also crucial for characterizing domain shifts. In this paper, we present a hierarchical metadata information constrained self-supervised (HMIC) ASD method, where the hierarchical relation between section IDs and attributes is constructed, and used as constraints to obtain finer feature representation. In addition, we propose an attribute-group-center (AGC)-based method for calculating the anomaly score under the domain shift condition. Experiments are performed to demonstrate its improved performance over…
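The attribute-group-center idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each group center is the mean embedding of normal training clips that share a section ID and attribute value, and that the anomaly score is the smallest cosine distance from a test embedding to any center. The function names, the group label strings, and the toy two-dimensional embeddings are all hypothetical.

```python
import numpy as np

def attribute_group_centers(embeddings, group_labels):
    """Mean embedding per attribute group (hypothetical helper).

    Each label combines a section ID with an attribute value, which is
    an assumed way of encoding the section -> attribute hierarchy.
    """
    labels = np.array(group_labels)
    return {g: embeddings[labels == g].mean(axis=0) for g in sorted(set(group_labels))}

def anomaly_score(embedding, centers):
    """Smallest cosine distance from the embedding to any group center (assumed scoring rule)."""
    def cosine_distance(a, b):
        return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return min(cosine_distance(embedding, c) for c in centers.values())

# Toy embeddings of normal training clips; real ones would come from a trained encoder.
train = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
groups = ["sec00/vel_low", "sec00/vel_low", "sec00/vel_high", "sec00/vel_high"]

centers = attribute_group_centers(train, groups)
normal_score = anomaly_score(np.array([1.0, 0.05]), centers)  # close to the vel_low center
odd_score = anomaly_score(np.array([-1.0, -1.0]), centers)    # far from both centers
```

In this toy setup, the clip near a group center receives a lower score than the clip pointing away from both centers, which is the behaviour a center-based score is meant to provide.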

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis