Strategic Data Augmentation with CTGAN for Smart Manufacturing: Enhancing Machine Learning Predictions of Paper Breaks in Pulp-and-Paper Production
Hamed Khosravi, Sarah Farhadpour, Manikanta Grandhi, Ahmed Shoyeb Raihan, Srinjoy Das, Imtiaz Ahmed

TL;DR
This paper introduces a novel data augmentation framework using CTGAN and SMOTE to improve machine learning predictions of rare paper break events in pulp-and-paper manufacturing, significantly enhancing detection performance.
Contribution
The study presents a new data augmentation approach combining CTGAN and SMOTE to address data scarcity in rare event prediction within industrial maintenance.
Findings
Prediction accuracy for paper breaks improved by over 30% with CTGAN augmentation.
Logistic Regression detection of breaks increased by nearly 90%.
Data augmentation significantly enhances machine learning model performance in rare event scenarios.
Abstract
A significant challenge for predictive maintenance in the pulp-and-paper industry is the infrequency of paper breaks during the production process. In this article, operational data is analyzed from a paper manufacturing machine in which paper breaks are relatively rare but have a high economic impact. Utilizing a dataset comprising 18,398 instances derived from a quality assurance protocol, we address the scarcity of break events (124 cases) that pose a challenge for machine learning predictive models. With the help of Conditional Tabular Generative Adversarial Networks (CTGAN) and Synthetic Minority Oversampling Technique (SMOTE), we implement a novel data augmentation framework. This method not only ensures that the synthetic data mirrors the distribution of the real operational data but also seeks to enhance the performance metrics of predictive modeling. Before and after the data augmentation, we…
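To make the imbalance in the abstract concrete, the sketch below computes the break rate from the reported counts and shows a SMOTE-style interpolation step in plain Python. It is a minimal illustration under stated assumptions, not the authors' pipeline. The function name `smote_like`, the neighbour count `k=5`, and the two-feature toy data are all hypothetical, and the CTGAN stage is not shown.

```python
import math
import random


def smote_like(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Counts reported in the abstract: 124 breaks among 18,398 instances.
n_total, n_breaks = 18_398, 124
break_rate = n_breaks / n_total
print(f"break rate: {break_rate:.2%}")  # prints "break rate: 0.67%"

# Hypothetical toy stand-in for the minority (break) class, two features.
toy_rng = random.Random(42)
toy_breaks = [(toy_rng.gauss(0, 1), toy_rng.gauss(5, 1)) for _ in range(20)]
new_points = smote_like(toy_breaks, n_new=100)
print(len(new_points), "synthetic minority points generated")
```

Because each synthetic point lies on a line segment between two real minority samples, the new points stay inside the region already covered by the observed breaks. A generative model such as CTGAN instead learns the joint distribution of the features, which is the complementary role it plays in the framework the abstract describes.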
Taxonomy
Topics: Industrial Vision Systems and Defect Detection · Manufacturing Process and Optimization · Advanced machining processes and optimization
Methods: Logistic Regression
