Histogram binning revisited with a focus on human perception

Raphael Sahann; Torsten Möller; Johanna Schmidt

arXiv:2109.06612·cs.HC·September 15, 2021

TL;DR

This study evaluates how effectively humans perceive data distributions from histograms, revealing optimal bin counts and comparing human perception with existing mathematical binning rules.

Contribution

It provides empirical insights into human perception of histograms and critiques existing binning formulas based on user study results.

Findings

01

More bins generally reduce perception errors up to a point.

02

Most existing binning formulas overestimate the number of bins a human viewer needs to see the distribution.

03

Beyond a certain bin count, adding more bins does not further reduce viewers' error rate.

Abstract

This paper presents a quantitative user study to evaluate how well users can visually perceive the underlying data distribution from a histogram representation. We used different sample and bin sizes and four different distributions (uniform, normal, bimodal, and gamma). The study results confirm that, in general, more bins correlate with fewer errors by the viewers. However, beyond a certain number of bins, adding more bins does not further improve the error rate. By comparing our study results with the outcomes of existing mathematical models for histogram binning (e.g., Sturges' formula, Scott's normal reference rule, the Rice Rule, or Freedman-Diaconis' choice), we can see that most of them overestimate the number of bins necessary to make the distribution visible to a human viewer.
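For readers who want to see what the four rules named in the abstract suggest for a given sample, the following is a minimal sketch using only the Python standard library. The formulas are the commonly cited textbook forms of each rule, not taken from the paper itself, and the function names (`sturges_bins`, `rice_bins`, `scott_bins`, `freedman_diaconis_bins`, `suggested_bins`) are hypothetical helpers introduced here for illustration. The sketch says nothing about which bin count the study participants performed best with.

```python
import math
import random
import statistics


def sturges_bins(n):
    """Sturges' formula: k = ceil(log2(n)) + 1, depends only on sample size."""
    return math.ceil(math.log2(n)) + 1


def rice_bins(n):
    """Rice Rule: k = ceil(2 * n^(1/3)), depends only on sample size."""
    return math.ceil(2 * n ** (1 / 3))


def _bins_from_width(data, width):
    """Convert a bin width into a bin count over the data range (at least 1)."""
    data_range = max(data) - min(data)
    if width <= 0 or data_range == 0:
        return 1  # degenerate sample: all values equal or zero spread
    return max(1, math.ceil(data_range / width))


def scott_bins(data):
    """Scott's normal reference rule: width h = 3.49 * s * n^(-1/3)."""
    n = len(data)
    width = 3.49 * statistics.stdev(data) * n ** (-1 / 3)
    return _bins_from_width(data, width)


def freedman_diaconis_bins(data):
    """Freedman-Diaconis' choice: width h = 2 * IQR * n^(-1/3)."""
    n = len(data)
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile cut points
    width = 2 * (q3 - q1) * n ** (-1 / 3)
    return _bins_from_width(data, width)


def suggested_bins(data):
    """Collect the bin count suggested by each rule for one sample."""
    n = len(data)
    return {
        "sturges": sturges_bins(n),
        "rice": rice_bins(n),
        "scott": scott_bins(data),
        "freedman_diaconis": freedman_diaconis_bins(data),
    }


if __name__ == "__main__":
    # Assumed example data: 1000 draws from a standard normal distribution.
    rng = random.Random(0)
    sample = [rng.gauss(0, 1) for _ in range(1000)]
    print(suggested_bins(sample))
```

Sturges' formula and the Rice Rule depend only on the sample size, while Scott's rule and Freedman-Diaconis' choice also use the spread of the data, so the four can disagree noticeably on the same sample.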
