Histogram binning revisited with a focus on human perception
Raphael Sahann, Torsten Möller, Johanna Schmidt

TL;DR
This study evaluates how effectively humans perceive data distributions from histograms, revealing optimal bin counts and comparing human perception with existing mathematical binning rules.
Contribution
It provides empirical insights into human perception of histograms and critiques existing binning formulas based on user study results.
Findings
More bins generally reduce perception errors up to a point.
Most existing binning formulas overestimate the number of bins a viewer needs to recognize the distribution.
Human perception of distribution is limited by bin count.
Abstract
This paper presents a quantitative user study to evaluate how well users can visually perceive the underlying data distribution from a histogram representation. We used different sample and bin sizes and four different distributions (uniform, normal, bimodal, and gamma). The study results confirm that, in general, more bins correlate with fewer errors by the viewers. However, beyond a certain number of bins, the error rate cannot be improved by adding more bins. By comparing our study results with the outcomes of existing mathematical models for histogram binning (e.g., Sturges' formula, Scott's normal reference rule, the Rice Rule, or Freedman-Diaconis' choice), we can see that most of them overestimate the number of bins necessary to make the distribution visible to a human viewer.
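For readers who want to see what the four binning rules named in the abstract suggest for a given sample, here is a minimal sketch using their common textbook forms. It is not the authors' code: the function names are hypothetical, and details such as the constant 3.49 in Scott's rule, the sample standard deviation (ddof=1), and rounding up to a whole number of bins are assumptions that vary between references.

```python
import math
import numpy as np

def sturges_bins(n):
    # Sturges' formula: k = ceil(log2(n)) + 1
    return math.ceil(math.log2(n)) + 1

def rice_bins(n):
    # Rice Rule: k = ceil(2 * n^(1/3))
    return math.ceil(2 * n ** (1 / 3))

def _bins_from_width(data, width):
    # Convert a bin width into a bin count over the data range.
    data_range = float(np.max(data) - np.min(data))
    if width <= 0 or data_range == 0:
        return 1  # degenerate sample: fall back to a single bin
    return math.ceil(data_range / width)

def scott_bins(data):
    # Scott's normal reference rule: h = 3.49 * sigma * n^(-1/3)
    data = np.asarray(data, dtype=float)
    width = 3.49 * np.std(data, ddof=1) * len(data) ** (-1 / 3)
    return _bins_from_width(data, width)

def freedman_diaconis_bins(data):
    # Freedman-Diaconis' choice: h = 2 * IQR * n^(-1/3)
    data = np.asarray(data, dtype=float)
    q75, q25 = np.percentile(data, [75, 25])
    width = 2 * (q75 - q25) * len(data) ** (-1 / 3)
    return _bins_from_width(data, width)

if __name__ == "__main__":
    # Illustrative sample only; not data from the study.
    rng = np.random.default_rng(0)
    sample = rng.normal(size=500)
    print("Sturges:", sturges_bins(len(sample)))
    print("Rice:", rice_bins(len(sample)))
    print("Scott:", scott_bins(sample))
    print("Freedman-Diaconis:", freedman_diaconis_bins(sample))
```

Sturges' formula and the Rice Rule depend only on the sample size, while Scott's rule and Freedman-Diaconis' choice also depend on the spread of the data. NumPy's `numpy.histogram_bin_edges` accepts `'sturges'`, `'rice'`, `'scott'`, and `'fd'` as values for `bins`, which offers a way to cross-check these counts.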
