Improving Anomalous Sound Detection via Low-Rank Adaptation Fine-Tuning of Pre-Trained Audio Models

Xinhu Zheng; Anbai Jiang; Bing Han; Yanmin Qian; Pingyi Fan; Jia Liu; Wei-Qiang Zhang

arXiv:2409.07016 · cs.SD · May 8, 2025

Open Access

TL;DR

This paper enhances anomalous sound detection by fine-tuning pre-trained audio models with Low-Rank Adaptation, achieving state-of-the-art results and addressing data scarcity issues in industrial environments.

Contribution

The paper introduces LoRA tuning of pre-trained audio models for anomalous sound detection (ASD), significantly improving performance over previous methods.

Findings

01  Achieved 77.75% on the DCASE2023 Task 2 evaluation set

02  Outperformed previous SOTA models by 6.48%

03  Validated effectiveness of LoRA tuning through ablation studies

Abstract

Anomalous Sound Detection (ASD) has gained significant interest through the application of various Artificial Intelligence (AI) technologies in industrial settings. Though possessing great potential, ASD systems can hardly be readily deployed in real production sites due to the generalization problem, which is primarily caused by the difficulty of data collection and the complexity of environmental factors. This paper introduces a robust ASD model that leverages audio pre-trained models. Specifically, we fine-tune these models using machine operation data, employing SpecAug as a data augmentation strategy. Additionally, we investigate the impact of utilizing Low-Rank Adaptation (LoRA) tuning instead of full fine-tuning to address the problem of limited data for fine-tuning. Our experiments on the DCASE2023 Task 2 dataset establish a new benchmark of 77.75% on the evaluation set, with a…
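
To make the LoRA idea mentioned in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the layer size (768 x 768), the rank (8), the scaling factor `alpha`, and the helper names `matmul`, `lora_effective_weight`, and `trainable_params` are all illustrative assumptions. It shows the general LoRA formulation, in which a frozen pre-trained weight `W0` is augmented by a trainable low-rank product `B A`, so that only `r * (d_out + d_in)` parameters are updated instead of `d_out * d_in`.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_effective_weight(w0, b, a, alpha):
    """Return W0 + (alpha / r) * B @ A.

    w0: frozen pre-trained weight, shape (d_out, d_in)
    b:  trainable factor, shape (d_out, r), usually initialised to zeros
    a:  trainable factor, shape (r, d_in)
    alpha: scaling hyperparameter (hypothetical value, not from the paper)
    """
    r = len(a)                      # rank of the low-rank update
    scale = alpha / r
    delta = matmul(b, a)            # low-rank update, shape (d_out, d_in)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(w0, delta)]


def trainable_params(d_out, d_in, r):
    """Number of parameters LoRA trains for one (d_out, d_in) weight."""
    return r * (d_out + d_in)


# Assumed sizes for illustration only: one 768 x 768 projection, rank 8.
full = 768 * 768
lora = trainable_params(768, 768, 8)
print(f"full fine-tuning: {full} params, LoRA: {lora} params "
      f"({100 * lora / full:.2f}% of full)")
```

Because `B` is usually initialised to zeros, the adapted layer starts out identical to the pre-trained one, and training only moves it within a low-rank subspace. That small trainable footprint is what makes the approach attractive when fine-tuning data is limited, which is the motivation the abstract gives for preferring LoRA over full fine-tuning.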

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis