Cross validation residuals for generalised least squares and other correlated data models
Ingrid Annette Baade

TL;DR
This paper extends leave-M-out cross validation from ordinary least squares to generalised least squares, relates the cross validation residuals to Cook's distance, and shows that all values needed for K fold cross validation are available after fitting the model once to all the data.
Contribution
It gives a way to compute leave-M-out cross validation residuals for generalised least squares without refitting the model to reduced datasets, and links those residuals to Cook's distance.
Findings
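Cross validation residuals are related to Cook's distance through an approximation to the difference in the generalised residual sum of squares between a model fit to all the data and a model fit to the training data only.
All values needed for K fold cross validation are available after fitting the model once to all the data, so no refitting to reduced datasets is needed.
The results extend from ordinary least squares to generalised least squares and other correlated data models.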
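For the ordinary least squares case, which the abstract treats as well known, the link between cross validation residuals and Cook's distance can be sketched in a few lines. This is a minimal illustration of the standard OLS identities only, not the paper's generalised least squares result; the function name `ols_loo_and_cook` is hypothetical.

```python
import numpy as np

def ols_loo_and_cook(X, y):
    """Leave-one-out residuals and Cook's distances from a single OLS fit.

    Standard OLS identities (not the paper's GLS derivation):
      loo_i  = e_i / (1 - h_ii)
      cook_i = loo_i**2 * h_ii / (p * s2)
    """
    n, p = X.shape
    H = X @ np.linalg.solve(X.T @ X, X.T)   # hat matrix
    h = np.diag(H)                          # leverages h_ii
    e = y - H @ y                           # ordinary residuals
    s2 = e @ e / (n - p)                    # residual variance estimate
    loo = e / (1.0 - h)                     # leave-one-out (CV) residuals
    cook = loo**2 * h / (p * s2)            # Cook's distance via CV residuals
    return loo, cook
```

Written this way, Cook's distance is the squared leave-one-out residual scaled by the leverage, which is the kind of relationship the paper carries over to correlated data.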
Abstract
Cross validation residuals are well known for the ordinary least squares model. Here leave-M-out cross validation is extended to generalised least squares. The relationship between cross validation residuals and Cook's distance is demonstrated, in terms of an approximation to the difference in the generalised residual sum of squares for a model fit to all the data (training and test) and a model fit to a reduced dataset (training data only). For generalised least squares, as for ordinary least squares, there is no need to refit the model to reduced size datasets as all the values for K fold cross validation are available after fitting the model to all the data.
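The no-refit claim can be checked numerically with a small sketch. It rests on assumptions not stated in the abstract: the error covariance V is treated as known, and the cross validation residual for a held-out set M is taken to be the observed values minus their best linear unbiased prediction from the training data, including the correlation between held-out and training errors. The paper's exact definition may differ. The shortcut uses the standard deletion identity with P = V^{-1} - V^{-1}X(X'V^{-1}X)^{-1}X'V^{-1}; the function names are hypothetical.

```python
import numpy as np

def gls_cv_residuals(X, y, V, folds):
    """Leave-M-out residuals from one full-data GLS fit (V assumed known)."""
    Vinv = np.linalg.inv(V)
    A = Vinv @ X
    # P = V^-1 - V^-1 X (X' V^-1 X)^-1 X' V^-1
    P = Vinv - A @ np.linalg.solve(X.T @ A, A.T)
    Py = P @ y
    # Residual for held-out set m: solve P[m, m] r = (P y)[m]
    return [np.linalg.solve(P[np.ix_(m, m)], Py[m]) for m in folds]

def gls_cv_residuals_refit(X, y, V, folds):
    """Brute-force reference: refit GLS on each training set and predict."""
    n = len(y)
    out = []
    for m in folds:
        t = np.setdiff1d(np.arange(n), m)          # training indices
        Vtt, Vmt = V[np.ix_(t, t)], V[np.ix_(m, t)]
        W = np.linalg.solve(Vtt, X[t])             # Vtt^-1 X_t
        beta = np.linalg.solve(X[t].T @ W, W.T @ y[t])
        r_t = y[t] - X[t] @ beta                   # training residuals
        pred = X[m] @ beta + Vmt @ np.linalg.solve(Vtt, r_t)
        out.append(y[m] - pred)
    return out

# Illustrative data: linear trend with AR(1)-type correlated errors.
rng = np.random.default_rng(0)
n = 12
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
idx = np.arange(n)
V = 0.6 ** np.abs(idx[:, None] - idx[None, :])
y = X @ np.array([1.0, 0.5]) + np.linalg.cholesky(V) @ rng.standard_normal(n)
folds = np.array_split(rng.permutation(n), 4)      # K = 4 folds

fast = gls_cv_residuals(X, y, V, folds)
slow = gls_cv_residuals_refit(X, y, V, folds)
print(all(np.allclose(f, s) for f, s in zip(fast, slow)))
```

Under these assumptions the two routes agree to numerical precision, while the first needs only quantities from the single fit to all the data.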
Taxonomy
Topics: Advanced Statistical Methods and Models · Statistical and numerical algorithms · Spectroscopy and Chemometric Analyses
