ggskewboxplots: Enhanced Boxplots for Skewed Data in R
Mustafa Cavus

TL;DR
This paper introduces ggskewboxplots, an R package offering robust, skewness-aware boxplot variants that improve data visualization accuracy for skewed distributions, supported by extensive simulation analysis.
Contribution
It presents a unified framework and new visualization tools for better boxplot representations of skewed data, addressing limitations of classical methods.
Findings
Classical boxplots are prone to swamping and masking in skewed data.
Skewness-adjusted boxplot variants outperform classical boxplots in sensitivity and specificity.
The ggskewboxplots package facilitates distribution-aware visualizations within ggplot2.
Abstract
Traditional boxplots are widely used for summarizing and visualizing the distribution of numerical data, yet they exhibit significant limitations when applied to skewed or heavy-tailed distributions, often leading to misclassification of outliers through swamping (flagging typical observations as outliers) or masking (failing to detect true outliers). This paper addresses these limitations by systematically evaluating several alternative boxplots specifically designed to accommodate distributional asymmetry. We introduce ggskewboxplots, an R package that integrates multiple robust and skewness-aware boxplot variants, providing a unified and user-friendly framework for exploratory data analysis. Using extensive Monte Carlo simulations under controlled skewness and kurtosis conditions, implemented via the mosaic approach based on the Skewed Exponential Power distribution, we assess…
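To make the swamping problem concrete, here is a minimal base-R sketch that is not taken from the paper and uses no ggskewboxplots functions. It applies the classical 1.5 × IQR fences to an uncontaminated symmetric sample and an uncontaminated right-skewed sample and compares the share of points flagged. The helper `flag_rate`, the seed, the sample size, and the choice of a standard lognormal as the skewed case are all assumptions made for illustration.

```r
# Illustrative only: base R, not the ggskewboxplots API or the paper's simulation design.
set.seed(42)             # arbitrary seed (assumption) for reproducibility
n <- 10000               # arbitrary sample size (assumption)

x_sym  <- rnorm(n)       # symmetric reference sample, no true outliers added
x_skew <- rlnorm(n)      # right-skewed sample, no true outliers added

# Hypothetical helper: share of points outside the classical fences
# [Q1 - k * IQR, Q3 + k * IQR], with k = 1.5 as in the classical boxplot.
flag_rate <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  fence <- k * (q[2] - q[1])
  mean(x < q[1] - fence | x > q[2] + fence)
}

rate_sym  <- flag_rate(x_sym)
rate_skew <- flag_rate(x_skew)

# Compare the two flag rates side by side
print(c(symmetric = rate_sym, skewed = rate_skew))
```

Under these assumptions the skewed sample should show a markedly higher flag rate than the symmetric one, even though neither contains contaminating observations. Every flagged point in the skewed sample is simply part of the long right tail, which is the swamping behavior that skewness-aware fences are meant to reduce.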
Taxonomy
Topics: Data Analysis with R · Advanced Statistical Methods and Models · Complex Systems and Time Series Analysis
