Histogram lies about distribution shape and Pearson's coefficient of variation lies about variability
Paulo S. P. Silveira, Jose O. Siqueira

TL;DR
This paper critically examines traditional statistical tools like histograms and Pearson's coefficient of variation, demonstrating their flaws and proposing density plots and Eisenhauer's relative dispersion coefficient as better alternatives for analyzing data distribution and variability.
Contribution
The paper reveals limitations of histograms and coefficient of variation and introduces density plots and Eisenhauer's coefficient as improved methods for data analysis.
Findings
Histograms can misrepresent distribution shapes.
Pearson's coefficient of variation is not invariant under linear transformations: adding a constant to the data changes its value.
Density plots and Eisenhauer's coefficient provide more accurate insights.
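The finding about the coefficient of variation can be checked with a few lines of arithmetic. The sketch below is not taken from the paper (whose examples are written in R); it is a minimal Python illustration, and the data values and the helper name `cv` are assumptions chosen for the demonstration. It computes the sample standard deviation divided by the mean for a small data set, then for the same data rescaled and shifted.

```python
import math
import statistics


def cv(values):
    """Pearson's coefficient of variation: sample standard deviation / mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical data, chosen only for illustration.
x = [10, 12, 14, 16, 18]

# Linear transformations of the same data.
scaled = [2 * v for v in x]      # change of unit (multiply by a constant)
shifted = [v + 100 for v in x]   # change of origin (add a constant)

cv_x = cv(x)
cv_scaled = cv(scaled)
cv_shifted = cv(shifted)

print(round(cv_x, 4))        # 0.2259
print(round(cv_scaled, 4))   # 0.2259, unchanged by rescaling
print(round(cv_shifted, 4))  # 0.0277, changed by the shift
```

Shifting leaves the spread of the data untouched (the standard deviation is the same), yet the coefficient drops by almost a factor of ten because only the mean in the denominator has moved.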
Abstract
Background and Objective: Histograms and Pearson's coefficient of variation are among the most popular summary statistics. Researchers use histograms to judge the shape of a quantitative data distribution by visual inspection. The coefficient of variation is taken as an estimator of the relative variability of these data. We explore properties of histograms and of the coefficient of variation through examples in R, and offer better alternatives: density plots and Eisenhauer's relative dispersion coefficient. Methods: Hypothetical examples developed in R are used to create histograms and density plots, and to compute the coefficient of variation and the relative dispersion coefficient. Results: These hypothetical examples clearly show that the two traditional approaches are flawed. Histograms do not necessarily reflect the distribution of probabilities and Pearson's coefficient of variation is not…
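The claim that a histogram need not reflect the underlying distribution can be illustrated by binning one data set in two ways. The sketch below is not one of the paper's R examples; it is a minimal Python illustration in which the data values and the helper `bin_counts` are hypothetical. Only the origin of the bins changes between the two calls; the data and the bin width stay the same.

```python
import math


def bin_counts(values, origin, width):
    """Count values in equal-width bins [origin + k*width, origin + (k+1)*width).

    Assumes every value is >= origin (an assumption of this sketch).
    """
    indices = [math.floor((v - origin) / width) for v in values]
    counts = [0] * (max(indices) + 1)
    for i in indices:
        counts[i] += 1
    return counts


# Hypothetical data, chosen only for illustration.
data = [0.9, 1.1, 1.9, 2.1, 2.9, 3.1]

peaked = bin_counts(data, origin=0.0, width=1.0)
flat = bin_counts(data, origin=0.5, width=1.0)

print(peaked)  # [1, 2, 2, 1]
print(flat)    # [2, 2, 2]
```

With bins starting at 0 the histogram looks peaked in the middle; with bins starting at 0.5 it looks perfectly flat. The apparent shape here is a product of the binning choice, not of the data.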
Taxonomy
Topics: Data Analysis with R
