Big Machinery Data Preprocessing Methodology for Data-Driven Models in Prognostics and Health Management
Sergio Cofre-Martel, Enrique Lopez Droguett, Mohammad Modarres

TL;DR
This paper introduces a comprehensive data preprocessing pipeline for machinery health monitoring data, emphasizing expert knowledge and validated through case studies, to improve data-driven prognostics models.
Contribution
It provides a formal, step-by-step data preprocessing methodology specifically designed for real-world PHM applications, addressing a gap in existing research.
Findings
Validated preprocessing pipeline through two case studies
Produced clean datasets with accurate health labels
Enhanced data quality for training prognostics models
Abstract
Sensor monitoring networks and advances in big data analytics have guided the reliability engineering landscape to a new era of big machinery data. Low-cost sensors, along with the evolution of the Internet of Things and Industry 4.0, have resulted in rich databases that can be analyzed through prognostics and health management (PHM) frameworks. Several data-driven models (DDMs) have been proposed and applied for diagnostic and prognostic purposes in complex systems. However, many of these models are developed using simulated or experimental data sets, and there is still a knowledge gap for applications in real operating systems. Furthermore, little attention has been given to the required data preprocessing steps compared to the training processes of these DDMs. To date, research works do not follow a formal and consistent data preprocessing guideline for PHM applications. This…
