Real-time outlier detection for large datasets by RT-DetMCD
Bart De Ketelaere, Mia Hubert, Jakob Raymaekers, Peter J. Rousseeuw, Iwein Vranckx

TL;DR
This paper introduces RT-DetMCD, a real-time outlier detection method for large datasets that is much faster than the original DetMCD while retaining its accuracy, making it practical for time-critical industrial data analysis.
Contribution
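The paper develops a faster, parallelizable version of DetMCD that uses two new initial estimators, update-based concentration steps, and a robust aggregation method for combining the results from parallel threads, making it suitable for real-time industrial applications.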
Findings
RT-DetMCD runs much faster than the original DetMCD.
The method maintains high outlier detection accuracy.
The method is applied successfully to industrial food sorting data.
Abstract
Modern industrial machines can generate gigabytes of data in seconds, frequently pushing the boundaries of available computing power. Together with the time criticality of industrial processing, this presents a challenging problem for any data analytics procedure. We focus on the deterministic minimum covariance determinant method (DetMCD), which detects outliers by fitting a robust covariance matrix. We construct a much faster version of DetMCD by replacing its initial estimators with two new methods and incorporating update-based concentration steps. The computation time is reduced further by parallel computing, with a novel robust aggregation method to combine the results from the threads. The speed and accuracy of the proposed real-time DetMCD method (RT-DetMCD) are illustrated by simulation and a real industrial application to food sorting.
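To make the idea of concentration steps concrete, here is a minimal Python sketch of the generic MCD concentration step (C-step): compute Mahalanobis distances to the current fit, keep the h closest points, and refit the mean and covariance on them. This is not the RT-DetMCD algorithm. The function names, the simple median/MAD starting point, and the fixed cutoff are all assumptions made for illustration, and the sketch omits the paper's new initial estimators, its update-based steps, the parallel aggregation, and the usual consistency correction and reweighting.

```python
import numpy as np

def mahalanobis_sq(X, mu, sigma):
    """Squared Mahalanobis distance of each row of X to (mu, sigma)."""
    diff = X - mu
    return np.einsum("ij,ij->i", diff @ np.linalg.inv(sigma), diff)

def c_step(X, mu, sigma, h):
    """One concentration step: refit on the h points closest to the current fit."""
    subset = np.argsort(mahalanobis_sq(X, mu, sigma))[:h]
    mu_new = X[subset].mean(axis=0)
    sigma_new = np.cov(X[subset], rowvar=False)
    return mu_new, sigma_new, subset

def mcd_sketch(X, h, max_iter=100):
    """Iterate C-steps from one illustrative start until the determinant settles."""
    # Assumed starting point (not from the paper): coordinatewise median
    # and a diagonal scatter matrix built from the scaled MAD.
    mu = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - mu), axis=0)
    sigma = np.diag(mad ** 2)
    prev_det = np.inf
    subset = np.arange(h)
    for _ in range(max_iter):
        mu, sigma, subset = c_step(X, mu, sigma, h)
        det = np.linalg.det(sigma)
        if np.isclose(det, prev_det):
            break
        prev_det = det
    return mu, sigma, subset

def flag_outliers(X, mu, sigma, cutoff):
    """Flag points whose squared robust distance exceeds the given cutoff."""
    return mahalanobis_sq(X, mu, sigma) > cutoff
```

A typical choice for `cutoff` is a high quantile of the chi-squared distribution with as many degrees of freedom as there are variables; for two variables the 0.975 quantile is about 7.378. Because the raw covariance of the h-subset is not rescaled here, the distances are somewhat inflated, so the flags are conservative toward marking points as outlying.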
