Understanding and Preparing Data of Industrial Processes for Machine Learning Applications

Philipp Fleck; Manfred Kügel; Michael Kommenda

arXiv:2109.03469·cs.LG·September 9, 2021


TL;DR

This paper introduces a novel data preprocessing technique for industrial machine learning applications that effectively handles large proportions of missing sensor data without discarding observations, demonstrated on steel production data.

Contribution

The paper presents a new method for utilizing incomplete industrial data, reducing the need for data removal when missing values are prevalent, with adaptable implementations based on data characteristics.

Findings

01 Method effectively handles large proportions of missing data

02 Application demonstrated on steel production data

03 Reduces data loss compared to removing observations that contain missing values

Abstract

Industrial applications of machine learning face unique challenges due to the nature of raw industry data. Preprocessing and preparing raw industrial data for machine learning applications is a demanding task that often takes more time and work than the modeling itself and poses additional challenges of its own. This paper addresses one of those challenges: missing values caused by sensors being unavailable at different production units of nonlinear production lines. When only a small proportion of the data is missing, the missing values can often be imputed. When large proportions are missing, imputation is often not feasible, and removing observations that contain missing values is often the only option. This paper presents a technique that makes it possible to use all of the available data without removing large amounts of…
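The trade-off the abstract describes between imputing and removing observations can be illustrated with a minimal sketch. This is not the paper's technique, which the truncated abstract does not spell out; it only shows the two baseline options. The sensor names, values, and helper functions are hypothetical, chosen so that each product skips a different production unit.

```python
import math

NAN = math.nan

# Hypothetical toy data: one row per product, one column per sensor.
# NaN marks a sensor at a production unit the product never passed through.
rows = [
    {"unit_a_temp": 1510.0, "unit_b_temp": NAN,    "unit_c_temp": 905.0},
    {"unit_a_temp": 1498.0, "unit_b_temp": 1220.0, "unit_c_temp": NAN},
    {"unit_a_temp": NAN,    "unit_b_temp": 1235.0, "unit_c_temp": 910.0},
    {"unit_a_temp": 1502.0, "unit_b_temp": 1228.0, "unit_c_temp": 899.0},
]


def missing_fraction(rows):
    """Share of all cells that are missing."""
    total = sum(len(r) for r in rows)
    missing = sum(math.isnan(v) for r in rows for v in r.values())
    return missing / total


def drop_incomplete(rows):
    """Baseline 1: remove every observation that has any missing value."""
    return [r for r in rows if not any(math.isnan(v) for v in r.values())]


def impute_column_mean(rows):
    """Baseline 2: fill each missing value with the mean of its column."""
    columns = list(rows[0].keys())
    means = {}
    for c in columns:
        observed = [r[c] for r in rows if not math.isnan(r[c])]
        means[c] = sum(observed) / len(observed)
    return [
        {c: (means[c] if math.isnan(r[c]) else r[c]) for c in columns}
        for r in rows
    ]


complete = drop_incomplete(rows)
imputed = impute_column_mean(rows)

print(f"missing cells: {missing_fraction(rows):.0%}")
print(f"rows kept after removal: {len(complete)} of {len(rows)}")
print(f"rows kept after imputation: {len(imputed)} of {len(rows)}")
```

In this toy case only a quarter of the cells are missing, yet removal discards three of the four observations because the gaps fall in different rows. Imputation keeps every row but fills the gaps with values that were never measured, which becomes harder to justify as the missing proportion grows.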
