A method to benchmark high-dimensional process drift detection
Edgar Wolf, Tobias Windisch

TL;DR
This paper introduces a synthetic data generation framework and an evaluation metric for benchmarking machine learning algorithms that detect drifts in high-dimensional manufacturing process curves, and shows that existing methods often struggle with datasets containing multiple drift segments.
Contribution
It presents a theoretic framework for generating synthetic process curves and a new evaluation score to benchmark drift detection algorithms.
Findings
Existing algorithms struggle with multiple drift segments
The new framework enables controlled benchmarking
The temporal AUC score quantifies detection performance
Abstract
Process curves are multivariate finite time series data coming from manufacturing processes. This paper studies machine learning methods that detect drifts in process curve datasets. A theoretic framework to synthetically generate process curves in a controlled way is introduced in order to benchmark machine learning algorithms for process drift detection. An evaluation score, called the temporal area under the curve, is introduced, which quantifies how well machine learning models unveil curves belonging to drift segments. Finally, a benchmark study comparing popular machine learning approaches on synthetic data generated with the introduced framework is presented, showing that existing algorithms often struggle with datasets containing multiple drift segments.
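To make the benchmarking idea concrete, the following is a minimal sketch of a controlled drift experiment, under stated assumptions. It is not the paper's generation framework, and the score it computes is the ordinary ROC AUC over per-curve drift labels, not the temporal area under the curve, whose definition is not given in this summary. The function names (`make_curves`, `drift_scores`, `roc_auc`), the sine base curve, the linear amplitude ramp, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def make_curves(n_curves=300, n_points=50,
                drift_segments=((100, 150), (220, 260)),
                drift_size=0.5, noise=0.05, seed=0):
    """Generate noisy sine curves with labeled drift segments (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    base = np.sin(2 * np.pi * x)
    # Every curve is the base shape plus independent Gaussian noise.
    curves = base + noise * rng.standard_normal((n_curves, n_points))
    labels = np.zeros(n_curves, dtype=bool)
    for start, end in drift_segments:
        # Assumption: drift is a linear ramp of the curve amplitude within a segment.
        ramp = np.linspace(0.0, drift_size, end - start)
        curves[start:end] += ramp[:, None] * base
        labels[start:end] = True
    return curves, labels

def drift_scores(curves, n_reference=50):
    """Score each curve by its Euclidean distance to the mean of the first curves."""
    reference = curves[:n_reference].mean(axis=0)
    return np.linalg.norm(curves - reference, axis=1)

def roc_auc(scores, labels):
    """Standard ROC AUC: chance that a drifted curve outscores a non-drifted one."""
    pos = scores[labels]
    neg = scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

curves, labels = make_curves()
auc = roc_auc(drift_scores(curves), labels)
print(f"ROC AUC of the distance-to-reference detector: {auc:.3f}")
```

Because the drift positions are chosen by the generator, the ground-truth labels are known exactly, which is what makes a controlled comparison of detectors possible. A plain ROC AUC ignores where in time the detections fall, which is the gap a time-aware score would address.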
Taxonomy
Topics: Fault Detection and Control Systems
