Benchmarking deep learning models for bearing fault diagnosis using the CWRU dataset: A multi-label approach
Rodrigo Kobashikawa Rosa, Danilo Braga, Danilo Silva

TL;DR
This paper introduces a multi-label fault diagnosis formulation for the CWRU bearing dataset that addresses data leakage, class imbalance, and unrealistic evaluation conditions, and benchmarks several deep learning models on the improved setup.
Contribution
It proposes a multi-label formulation and a realistic dataset division to improve fault diagnosis accuracy and evaluation, along with a comprehensive benchmark of deep learning models.
Findings
Multi-label approach reduces data leakage effects.
New dataset division improves model performance.
Benchmarking reveals strengths and weaknesses of models.
Abstract
This paper proposes a novel approach for modeling the problem of fault diagnosis using the Case Western Reserve University (CWRU) bearing fault dataset. Although the dataset is considered a standard reference for testing new algorithms, the typical dataset division suffers from data leakage, as shown by Hendriks et al. (2022) and Abburi et al. (2023), leading to papers reporting over-optimistic results. While their proposed division significantly mitigates this issue, it does not eliminate it entirely. Moreover, their proposed multi-class classification task can still lead to an unrealistic scenario by excluding the possibility of more than one fault type occurring at the same or different locations. As advocated in this paper, a multi-label formulation (detecting the presence of each type of fault for each location) can solve both issues, leading to a scenario closer to reality.…
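To make the multi-label idea concrete, the following is a minimal Python sketch of how targets could be encoded with one binary output per (location, fault type) pair. The location and fault-type names, the label ordering, and the helper functions `encode_faults` and `decode_scores` are illustrative assumptions, not the authors' actual label scheme or code.

```python
from itertools import product

# Assumed label layout (hypothetical): one binary output per
# (location, fault type) pair, ordered location-major.
LOCATIONS = ("drive_end", "fan_end")
FAULT_TYPES = ("inner_race", "ball", "outer_race")
LABELS = [f"{loc}/{fault}" for loc, fault in product(LOCATIONS, FAULT_TYPES)]


def encode_faults(faults):
    """Turn a set of (location, fault_type) pairs into a binary target vector."""
    present = {f"{loc}/{fault}" for loc, fault in faults}
    unknown = present - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in present else 0 for name in LABELS]


def decode_scores(scores, threshold=0.5):
    """Return every label whose independent per-label score reaches the threshold."""
    return [name for name, score in zip(LABELS, scores) if score >= threshold]


# A healthy machine is the all-zero vector, so it needs no class of its own.
healthy = encode_faults(set())
# A single fault sets exactly one entry.
single = encode_faults({("drive_end", "inner_race")})
# Simultaneous faults at different locations are representable,
# which a single multi-class label cannot express.
combined = encode_faults({("drive_end", "inner_race"), ("fan_end", "ball")})
```

Under this encoding each output is thresholded independently, so a model can report no fault, one fault, or several faults at once, whereas a multi-class softmax must always pick exactly one class.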
Taxonomy
Topics: Machine Fault Diagnosis Techniques
