Outlier detection and a tail-adjusted boxplot based on extreme value theory
Shrijita Bhattacharya, Jan Beirlant

TL;DR
This paper introduces a data-driven method to identify extreme tail behavior in data, enabling more accurate outlier detection and a novel tail-adjusted boxplot based on extreme value theory.
Contribution
It extends the tail test of Bhattacharya et al. (2019), originally developed for heavy-tailed models, to all max-domains of attraction, and proposes a tail-adjusted boxplot that represents possible outliers more accurately.
Findings
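Detects extreme outliers, as well as sets of extreme data that show less spread than the bulk of the data
Gives a more accurate representation of possible outliers through the tail-adjusted boxplot
Illustrates the finite-sample behaviour of the approach with several examples and simulation results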
Abstract
Whether an extreme observation is an outlier depends strongly on the tail behaviour of the underlying distribution. We develop an automatic, data-driven method to identify extreme tail behaviour that deviates from the intermediate and central characteristics. This allows for detecting extreme outliers or sets of extreme data that show less spread than the bulk of the data. To this end, we extend a testing method proposed in Bhattacharya et al. (2019) for the specific case of heavy-tailed models to all max-domains of attraction. Consequently, we propose a tail-adjusted boxplot which yields a more accurate representation of possible outliers. Several examples and simulation results illustrate the finite-sample behaviour of this approach.
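To make the abstract's first sentence concrete, the sketch below compares the classical boxplot rule (upper fence at Q3 + 1.5 × IQR) on a heavy-tailed sample and a bounded-tail sample, and estimates the extreme value index of each with the moment estimator of Dekkers, Einmahl and de Haan. This is not the authors' test or their tail-adjusted boxplot; it is only a minimal illustration, assuming NumPy, of why a fixed fence ignores tail behaviour. The function names, the sample sizes, and the choice k = 1000 are illustrative assumptions.

```python
import numpy as np


def tukey_upper_fence(x, c=1.5):
    """Upper fence of the classical boxplot: Q3 + c * IQR."""
    q1, q3 = np.percentile(x, [25, 75])
    return q3 + c * (q3 - q1)


def moment_estimator(x, k):
    """Moment estimator of the extreme value index from the k largest values.

    Works across all max-domains of attraction, but requires the
    (k+1)-th largest observation to be positive.
    """
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.size
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < len(x)")
    threshold = xs[n - k - 1]
    if threshold <= 0:
        raise ValueError("the threshold order statistic must be positive")
    # Log-excesses of the k largest observations over the threshold.
    log_excess = np.log(xs[n - k:]) - np.log(threshold)
    m1 = np.mean(log_excess)
    m2 = np.mean(log_excess ** 2)
    return m1 + 1.0 - 0.5 / (1.0 - m1 ** 2 / m2)


rng = np.random.default_rng(0)
n, k = 20000, 1000  # illustrative choices, not taken from the paper
samples = {
    "pareto": rng.pareto(2.0, n) + 1.0,    # heavy tail, true index 0.5
    "uniform": rng.uniform(1.0, 2.0, n),   # bounded tail, true index -1
}
summary = {}
for name, x in samples.items():
    flagged = float(np.mean(x > tukey_upper_fence(x)))
    summary[name] = (flagged, moment_estimator(x, k))
    print(name, summary[name])
```

Neither sample contains contamination, yet the fixed fence flags a noticeable share of the Pareto sample and none of the uniform one. A boxplot whose fences adapt to the estimated tail, as the paper proposes, is meant to avoid this kind of mislabelling.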
Taxonomy
Topics: Financial Risk and Volatility Modeling · Market Dynamics and Volatility · Monetary Policy and Economic Impact
