Beware of Validation by Eye: Visual Validation of Linear Trends in Scatterplots
Daniel Braun, Remco Chang, Michael Gleicher, Tatiana von Landesberger

TL;DR
This study empirically evaluates how accurately people can visually validate linear regression models in scatterplots, revealing biases and the limited effectiveness of common visualization enhancements, thus urging caution in visual validation practices.
Contribution
It provides the first empirical analysis of human ability to visually validate linear trends and assesses the impact of visualization design choices on validation accuracy.
Findings
Participants estimate slopes (fitting a line to data) more accurately than they validate slopes (accepting a shown line).
In both estimation and validation, participants are biased toward slopes that are "too steep".
Common visualization enhancements did not significantly improve validation accuracy.
Abstract
Visual validation of regression models in scatterplots is a common practice for assessing model quality, yet its efficacy remains unquantified. We conducted two empirical experiments to investigate individuals' ability to visually validate linear regression models (linear trends) and to examine the impact of common visualization designs on validation quality. The first experiment showed that the level of accuracy for visual estimation of slope (i.e., fitting a line to data) is higher than for visual validation of slope (i.e., accepting a shown line). Notably, we found bias toward slopes that are "too steep" in both cases. This led to the novel insight that participants naturally assessed regression with orthogonal distances between the points and the line (i.e., orthogonal distance regression, ODR) rather than the common vertical distances (ordinary least squares, OLS). In the second experiment, we investigated whether…
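The distinction between vertical (OLS) and orthogonal (ODR) distances in the abstract can be made concrete with a small numerical sketch. It is not the authors' experimental code: the data are synthetic, the function names `ols_slope` and `odr_slope` are hypothetical, and the ODR slope uses the closed-form orthogonal (total least squares) fit, which assumes equal error scale on both axes. For positively correlated data, the orthogonal fit is never shallower than the OLS fit, which matches the direction of the "too steep" bias described above.

```python
import numpy as np


def ols_slope(x, y):
    """Slope minimizing squared VERTICAL distances (ordinary least squares)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.dot(xc, yc) / np.dot(xc, xc))


def odr_slope(x, y):
    """Slope minimizing squared ORTHOGONAL distances (total least squares).

    Closed form for a straight line, assuming equal error scale on both axes.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    sxx = np.dot(xc, xc)
    syy = np.dot(yc, yc)
    sxy = np.dot(xc, yc)
    if sxy == 0:
        raise ValueError("orthogonal slope is undefined when x and y are uncorrelated")
    return float((syy - sxx + np.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy))


# Synthetic, illustrative data only (not taken from the study).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 0.5 * x + rng.normal(0, 2.0, 200)

b_ols = ols_slope(x, y)
b_odr = odr_slope(x, y)
print(f"OLS slope: {b_ols:.3f}")
print(f"ODR slope: {b_odr:.3f}")
```

On noiseless data the two slopes coincide. The gap grows as the scatter around the line increases, so a viewer who judges fit by orthogonal distance would tend to prefer a line steeper than the OLS fit.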
Taxonomy
Methods: Linear Regression
