IMPACT: Industrial Machine Perception via Acoustic Cognitive Transformer

Changheon Han; Yuseop Sim; Hoin Jung; Jiho Lee; Hojun Lee; Yun Seok Kang; Sucheol Woo; Garam Kim; Hyung Wook Park; Martin Byung-Guk Jun

arXiv:2507.06481·cs.SD·July 10, 2025

TL;DR

This paper introduces IMPACT, a self-supervised transformer model trained on a large industrial audio dataset, improving machine sound analysis and outperforming existing methods across diverse industrial audio tasks.

Contribution

The paper presents a new large-scale industrial audio dataset, DINOS, and a novel foundation model, IMPACT, for industrial machine sound analysis, with superior performance on multiple downstream tasks.

Findings

01. IMPACT outperforms existing models on 24 of 30 tasks.

02. The DINOS dataset contains over 74,000 audio samples from industrial scenarios.

03. IMPACT effectively captures both global and fine-grained audio features.

Abstract

Acoustic signals from industrial machines offer valuable insights for anomaly detection, predictive maintenance, and operational efficiency enhancement. However, existing task-specific, supervised learning methods often scale poorly and fail to generalize across diverse industrial scenarios, whose acoustic characteristics are distinct from general audio. Furthermore, the scarcity of accessible, large-scale datasets and pretrained models tailored for industrial audio impedes community-driven research and benchmarking. To address these challenges, we introduce DINOS (Diverse INdustrial Operation Sounds), a large-scale open-access dataset. DINOS comprises over 74,149 audio samples (exceeding 1,093 hours) collected from various industrial acoustic scenarios. We also present IMPACT (Industrial Machine Perception via Acoustic Cognitive Transformer), a novel foundation model for industrial…

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 4 · Confidence 2

Strengths

The paper is unique and interesting. The paper is written well and contains detailed experiments. The self-supervised model IMPACT, trained on the proposed data, achieves the best performance across the majority of the tasks.

Weaknesses

The paper has limited novelty. The primary contribution of the paper is the dataset; the IMPACT model is based on a well-known existing self-supervised model, EAT.

Reviewer 02 · Rating 6 · Confidence 4

Strengths

- The paper is well written, apart from some minor grammatical errors (authors, please recheck for missing spaces and punctuation).
- A comprehensive benchmarking setup, with distinct pretraining and downstream benchmarking sets, is provided.
- Limited availability of public, large-scale corpora is a major pain point in manufacturing and floor monitoring, so the dataset could indeed prove invaluable to the community.
- Evaluation, to the extent done in the paper, is good.

Weaknesses

- Based on the results alone, it is hard to say how useful the proposed dataset is over the publicly available DCASE2025 Challenge Task 2 dataset for pretraining.

Reviewer 03 · Rating 2 · Confidence 4

Strengths

1. The collection of DINOS is an earnest effort. DINOS consists of the signals collected from both a microphone and a stethoscope, and covers various types of equipment.
2. The authors evaluated the performance of various off-the-shelf pretrained models on DINOS.

Weaknesses

1. The evaluation is critically insufficient and cannot show the superiority of IMPACT. The authors did not apply other pretraining methods (e.g., AudioMAE) on DINOS. They only evaluated the off-the-shelf pretrained models (e.g., a model pretrained using AudioMAE method on other acoustic datasets) on DINOS. Since IMPACT is a pretraining method, if the authors want to show the superiority of IMPACT, they need to **pretrain** IMPACT and other pretraining methods (e.g., AudioMAE) **on the same data

Code & Models

Repositories

hanprd/IMPACT
PyTorch · Official

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Time Series Analysis and Forecasting · Machine Fault Diagnosis Techniques