Comparison of Outlier Detection Techniques for Structured Data
Amulya Agarwal, Nitin Gupta

TL;DR
This paper compares various outlier detection techniques for structured data to assist data scientists in selecting appropriate algorithms for improved machine learning model performance.
Contribution
It provides a comparative analysis of existing outlier detection methods, highlighting their strengths and use cases for structured data.
Findings
Certain techniques outperform others in specific data scenarios
Removing detected outliers from the training data before modeling can improve model accuracy
The paper offers guidance for choosing suitable outlier detection methods
Abstract
An outlier is an observation or data point that lies far from the rest of the data points in a given dataset; equivalently, an outlier lies far from the center of mass of the observations. The presence of outliers can skew statistical measures and data distributions, which can lead to a misleading representation of the underlying data and relationships. Removing outliers from the training dataset before modeling can give better predictions. As machine learning advances, outlier detection models are advancing quickly as well. The goal of this work is to highlight and compare some existing outlier detection techniques so that data scientists can use this information to select an outlier detection algorithm when building a machine learning model.
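To make the "far from the center of mass" definition concrete, here is a minimal sketch in Python. It is not one of the techniques compared in the paper: the function names, the toy data, and the cutoff rule (mean distance plus k population standard deviations, with k = 2) are all assumptions chosen for illustration.

```python
import math
import statistics


def centroid(points):
    """Return the coordinate-wise mean (center of mass) of the points."""
    dims = len(points[0])
    return [statistics.fmean(p[d] for p in points) for d in range(dims)]


def flag_outliers(points, k=2.0):
    """Flag points whose distance from the centroid exceeds mean + k * stdev.

    The cutoff rule and the default k are illustrative assumptions,
    not a method taken from the paper.
    """
    center = centroid(points)
    distances = [math.dist(p, center) for p in points]
    cutoff = statistics.fmean(distances) + k * statistics.pstdev(distances)
    return [d > cutoff for d in distances]


# Hypothetical toy data: six points near (1, 1) and one far away.
points = [
    (1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (1.0, 1.0),
    (1.2, 1.1), (0.8, 0.9), (10.0, 10.0),
]
flags = flag_outliers(points)
outliers = [p for p, is_out in zip(points, flags) if is_out]
inliers = [p for p, is_out in zip(points, flags) if not is_out]

print("outliers:", outliers)
print("centroid with outliers:   ", centroid(points))
print("centroid without outliers:", centroid(inliers))
```

Printing the centroid before and after removal shows the skew the abstract describes: a single distant point pulls the mean well away from where most observations sit. Because the centroid and standard deviation are themselves affected by outliers, a rule like this can miss outliers when several are present, which is one reason more robust techniques exist.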
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection · Advanced Statistical Methods and Models
