Fitting Bell Curves to Data Distributions using Visualization
Eric Newburger, Michael Correll, Niklas Elmqvist

TL;DR
This crowdsourced study tests how well people can visually fit normal curves to data distributions shown in four visualization types: bar histograms, dotplot histograms, strip plots, and boxplots.
Contribution
The paper provides empirical evidence of people's ability to match data distributions with idealized curves across multiple visualization techniques.
Findings
People can estimate the mean with some accuracy and little bias.
Individuals tend to overestimate the standard deviation, a bias the authors call the 'umbrella effect'.
Strip plots yield the most accurate curve fitting among tested visualizations.
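To make the two error measures in these findings concrete, here is a minimal sketch of how a single visually fitted curve could be scored against the sample it was fitted to. It is an illustration only, not the authors' analysis: the function name `fit_errors`, the returned keys, and the example numbers are all hypothetical.

```python
import random
import statistics


def fit_errors(sample, fitted_mean, fitted_sd):
    """Score a visually fitted normal curve against the sample's statistics.

    Hypothetical helper for illustration; not taken from the paper.
    """
    sample_mean = statistics.mean(sample)
    sample_sd = statistics.stdev(sample)  # sample SD (n - 1 denominator)
    return {
        # Signed error of the fitted center, in data units.
        "mean_error": fitted_mean - sample_mean,
        # Same error expressed in sample SDs, comparable across samples.
        "mean_error_in_sd": (fitted_mean - sample_mean) / sample_sd,
        # Ratio above 1 means the fitted curve is wider than the data.
        "sd_ratio": fitted_sd / sample_sd,
    }


# Made-up example: a sample of 200 draws and one respondent's fitted curve.
rng = random.Random(0)
sample = [rng.gauss(50, 10) for _ in range(200)]
print(fit_errors(sample, fitted_mean=50.5, fitted_sd=12.0))
```

Under this scoring, a `mean_error` near zero across many responses would correspond to an unbiased estimate of the center, and an `sd_ratio` that sits consistently above 1 would correspond to the overestimated spread described above.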
Abstract
Idealized probability distributions, such as normal or other curves, lie at the root of confirmatory statistical tests. But how well do people understand these idealized curves? In practical terms, does the human visual system allow us to match sample data distributions with hypothesized population distributions from which those samples might have been drawn? And how do different visualization techniques impact this capability? This paper shares the results of a crowdsourced experiment that tested the ability of respondents to fit normal curves to four different data distribution visualizations: bar histograms, dotplot histograms, strip plots, and boxplots. We find that the crowd can estimate the center (mean) of a distribution with some success and little bias. We also find that people generally overestimate the standard deviation, which we dub the "umbrella effect" because people tend…
