A Plot is Worth a Thousand Tests: Assessing Residual Diagnostics with the Lineup Protocol

Weihao Li; Dianne Cook; Emi Tanaka; Susan VanderPlas

arXiv:2308.05964·stat.AP·November 4, 2024·J. Comput. Graph. Stat.·1 cites

TL;DR

This paper demonstrates that visual residual diagnostics using the lineup protocol provide more reliable and comprehensive model assessment than traditional numerical tests, which can be overly sensitive or insensitive.

Contribution

It applies the lineup protocol, an existing visual inference method, to residual diagnostics and reports experimental evidence that reading residual plots this way gives more reliable model assessment than conventional tests.

Findings

01. Lineup protocol yields more reliable residual diagnostics.

02. Traditional tests are often too sensitive or insensitive.

03. Visual inference detects multiple residual issues simultaneously.

Abstract

Regression experts consistently recommend plotting residuals for model diagnosis, despite the availability of many numerical hypothesis test procedures designed to use residuals to assess problems with a model fit. Here we provide evidence for why this is good advice using data from a visual inference experiment. We show how conventional tests are too sensitive, which means that too often the conclusion would be that the model fit is inadequate. The experiment uses the lineup protocol which puts a residual plot in the context of null plots. This helps generate reliable and consistent reading of residual plots for better model diagnosis. It can also help in an obverse situation where a conventional test would fail to detect a problem with a model due to contaminated data. The lineup protocol also detects a range of departures from good residuals simultaneously. Supplemental materials for…
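The lineup protocol described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes a simple linear model, generates null residuals by simulating normal errors with the estimated residual standard deviation and refitting, and hides the true residuals at a random position among m = 20 panels. The function names (`fit_line`, `residuals`, `make_lineup`, `binomial_p_value`) and the simple binomial p-value are assumptions made for illustration; the paper's own p-value calculation may differ.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def residuals(x, y):
    """Residuals of the fitted simple linear model."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def make_lineup(x, y, m=20, rng=None):
    """Return (panels, pos): m residual sets, with the true one at index pos.

    Null panels are residuals from refitting the model to pure normal noise
    whose standard deviation is estimated from the true fit (an assumption
    of this sketch).
    """
    rng = rng or random.Random()
    true_res = residuals(x, y)
    n = len(x)
    sigma = math.sqrt(sum(r * r for r in true_res) / (n - 2))
    panels = []
    for _ in range(m - 1):
        y_null = [rng.gauss(0.0, sigma) for _ in x]
        panels.append(residuals(x, y_null))
    pos = rng.randrange(m)
    panels.insert(pos, true_res)
    return panels, pos


def binomial_p_value(detections, observers, m=20):
    """P(at least `detections` of `observers` pick the data plot by chance).

    Simplification: treats observers as independent, each guessing with
    probability 1/m.
    """
    p = 1.0 / m
    return sum(
        math.comb(observers, k) * p ** k * (1 - p) ** (observers - k)
        for k in range(detections, observers + 1)
    )


# Hypothetical data: a quadratic trend that a straight-line fit will miss.
rng = random.Random(1)
x = [i / 10 for i in range(-20, 21)]
y = [1.0 + 0.5 * xi + 0.8 * xi ** 2 + rng.gauss(0.0, 0.5) for xi in x]
panels, pos = make_lineup(x, y, m=20, rng=rng)
```

In use, each entry of `panels` would be drawn as a residual plot and shown to observers without revealing `pos`. If observers single out the panel at `pos`, that is evidence the true residuals differ from what a correctly specified model would produce; with one observer and 20 panels, a correct pick by chance has probability 1/20.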

Code & Models

Repositories

tengmcing/lineup_residual_diagnostics (Official)

Taxonomy

Topics: Risk and Safety Analysis