Multi-scale Scanning Network for Machine Anomalous Sound Detection

Yucong Zhang; Juan Liu; Ming Li

arXiv:2508.17194 · cs.SD · August 26, 2025

TL;DR

This paper introduces a Multi-scale Scanning Network (MSN) that captures multi-scale patterns in machine sounds for improved Anomalous Sound Detection, achieving state-of-the-art results on standard datasets.

Contribution

The paper proposes a novel MSN architecture that scans spectrograms at multiple scales using kernel boxes and shared-weight convolutional networks for enhanced ASD performance.

Findings

01. MSN outperforms existing methods on the DCASE 2020 and DCASE 2023 Task 2 datasets.

02. Scanning spectrograms at multiple scales improves detection accuracy.

03. Sharing weights across scales keeps the convolutional network lightweight.

Abstract

Machine sounds exhibit consistent and repetitive patterns in both the frequency and time domains, which vary significantly across scales for different machine types. For instance, rotating machines often show periodic features in short time intervals, while reciprocating machines exhibit broader patterns spanning the time domain. While prior studies have leveraged these patterns to improve Anomalous Sound Detection (ASD), the variation of patterns across scales remains insufficiently explored. To address this gap, we introduce a Multi-scale Scanning Network (MSN) designed to capture patterns at multiple scales. MSN employs kernel boxes of varying sizes to scan audio spectrograms and integrates a lightweight convolutional network with shared weights for efficient and scalable feature representation. Experimental evaluations on the DCASE 2020 and DCASE 2023 Task 2 datasets demonstrate…
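
The scanning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the box sizes, the 50% overlap, the average-pooling step that brings every scale to a common grid, and the single linear layer standing in for the shared-weight convolutional network are all assumptions made for illustration, as are the function names.

```python
import numpy as np

def extract_patches(spec, box, stride):
    """Slide a (freq, time) box over the spectrogram and collect the patches."""
    bf, bt = box
    sf, st = stride
    n_freq, n_time = spec.shape
    patches = [
        spec[f:f + bf, t:t + bt]
        for f in range(0, n_freq - bf + 1, sf)
        for t in range(0, n_time - bt + 1, st)
    ]
    return np.stack(patches)  # (num_patches, bf, bt)

def pool_to_grid(patches, grid):
    """Average-pool each patch to a fixed grid so every scale has the same shape.

    Assumes the box sides are integer multiples of the grid sides.
    """
    n, bf, bt = patches.shape
    gf, gt = grid
    return patches.reshape(n, gf, bf // gf, gt, bt // gt).mean(axis=(2, 4))

def shared_encoder(x, weights):
    """One linear map plus ReLU, reused for every scale.

    Hypothetical stand-in for the paper's shared-weight convolutional network.
    """
    flat = x.reshape(x.shape[0], -1)
    return np.maximum(flat @ weights, 0.0)

def multi_scale_embedding(spec, boxes, grid, weights):
    """Scan with each box size, encode with the same weights, and concatenate."""
    feats = []
    for box in boxes:
        stride = (box[0] // 2, box[1] // 2)  # assumed 50% overlap between boxes
        patches = extract_patches(spec, box, stride)
        pooled = pool_to_grid(patches, grid)
        encoded = shared_encoder(pooled, weights)
        feats.append(encoded.mean(axis=0))  # average over all positions at this scale
    return np.concatenate(feats)

rng = np.random.default_rng(0)
spec = rng.standard_normal((128, 64))          # placeholder for a log-mel spectrogram
boxes = [(8, 8), (16, 16), (32, 32)]           # assumed kernel-box sizes
grid = (8, 8)                                  # common shape fed to the shared encoder
weights = rng.standard_normal((64, 16)) * 0.1  # untrained shared weights
embedding = multi_scale_embedding(spec, boxes, grid, weights)
print(embedding.shape)
```

Because the same `weights` array encodes every scale, the parameter count does not grow with the number of box sizes. Only the length of the concatenated embedding does.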
