Diagnostic Tool for Out-of-Sample Model Evaluation

Ludvig Hult; Dave Zachariah; Petre Stoica

arXiv:2206.10982 · stat.ML · October 17, 2023

TL;DR

This paper introduces a simple, interpretable diagnostic tool that uses a finite calibration data set to characterize a model's out-of-sample losses. It provides finite-sample guarantees under weak assumptions and aids model selection and hyper-parameter tuning.

Contribution

It proposes a novel, easy-to-compute diagnostic method with finite-sample guarantees for assessing out-of-sample losses under weak assumptions.

Findings

01 Quantifies the impact of distribution shifts on out-of-sample losses.

02 Aids the analysis of regression models.

03 Enables model selection and hyper-parameter tuning.

Abstract

Assessment of model fitness is a key part of machine learning. The standard paradigm is to learn models by minimizing a chosen loss function averaged over training data, with the aim of achieving small losses on future data. In this paper, we consider the use of a finite calibration data set to characterize the future, out-of-sample losses of a model. We propose a simple model diagnostic tool that provides finite-sample guarantees under weak assumptions. The tool is simple to compute and to interpret. Several numerical experiments are presented to show how the proposed method quantifies the impact of distribution shifts, aids the analysis of regression, and enables model selection as well as hyper-parameter tuning.
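
One way to make the idea of characterizing future losses from a finite calibration set concrete is an order-statistic bound. The sketch below is not taken from the paper or its repository. It assumes that the calibration losses and the future loss are exchangeable, and the names `loss_limit` and `limit_curve` are hypothetical. Under that assumption, the k-th smallest of n calibration losses, with k = ceil((n + 1)(1 - alpha)), is exceeded by a new loss with probability at most alpha.

```python
import math
import random


def loss_limit(calibration_losses, alpha):
    """Return a level that a new loss exceeds with probability at most alpha.

    Assumption (not from the source): the calibration losses and the
    future loss are exchangeable. Returns math.inf when the calibration
    set is too small to certify the requested alpha.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    n = len(calibration_losses)
    # 1-indexed rank of the order statistic that gives coverage >= 1 - alpha.
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        return math.inf
    return sorted(calibration_losses)[k - 1]


def limit_curve(calibration_losses, alphas):
    """Evaluate loss_limit over several alpha values (hypothetical helper)."""
    return [(a, loss_limit(calibration_losses, a)) for a in alphas]


if __name__ == "__main__":
    # Synthetic squared-error losses standing in for a real model's
    # calibration losses; the data here is illustrative only.
    rng = random.Random(0)
    losses = [rng.gauss(0.0, 1.0) ** 2 for _ in range(200)]
    for alpha, limit in limit_curve(losses, [0.5, 0.2, 0.1, 0.05]):
        print(f"alpha={alpha:.2f}  limit={limit:.3f}")
```

Plotting the resulting (alpha, limit) pairs for two candidate models on the same calibration set gives a simple way to compare them: at any fixed alpha, the model with the lower limit has the tighter certified bound on its future loss. With fewer than 1/alpha - 1 calibration points no finite limit can be certified, which is why the function returns infinity in that case.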

Code & Models

Repositories

el-hult/lal
PyTorch · Official

Taxonomy

Topics: Machine Learning and Data Classification · Gaussian Processes and Bayesian Inference · Model Reduction and Neural Networks