A General Model Validation and Testing Tool

Kevin Vanslette; Tony Tohme; Kamal Youcef-Toumi

arXiv:1908.11251 · stat.ME · October 28, 2019 · Reliab. Eng. Syst. Saf.

TL;DR

The paper introduces the Bayesian Validation Metric (BVM), a comprehensive tool that unifies and extends existing validation metrics, enabling more flexible and quantifiable model validation and comparison.

Contribution

The paper proposes the BVM as a general validation framework that represents the standard validation metrics as special cases and supports novel compound metrics for model assessment.

Findings

01 BVM can represent all standard validation metrics as special cases.

02 New compound validation metrics outperform existing ones in certain scenarios.

03 The BVM Ratio quantifies model selection under uncertainty.

Abstract

We construct and propose the "Bayesian Validation Metric" (BVM) as a general model validation and testing tool. We find the BVM to be capable of representing all of the standard validation metrics (square error, reliability, probability of agreement, frequentist, area, probability density comparison, statistical hypothesis testing, and Bayesian model testing) as special cases and find that it can be used to improve, generalize, or further quantify their uncertainties. Thus, the BVM allows us to assess the similarities and differences between existing validation metrics in a new light. The BVM has the capacity to allow users to invent and select models according to novel validation requirements. We formulate and test a few novel compound validation metrics that improve upon other validation metrics in the literature. Further, we construct the BVM Ratio for the purpose of quantifying…
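The abstract lists reliability and probability of agreement among the metrics the BVM covers, but it does not state the BVM's definition. As a minimal sketch of that family of metrics only, the Python below estimates by Monte Carlo the probability that a model output and an observed value satisfy a user-chosen Boolean agreement rule. The function names, the Gaussian samplers, and the tolerance rule are all hypothetical choices made for illustration and are not taken from the paper.

```python
import random


def agreement_probability(model_sampler, data_sampler, agree,
                          n_samples=20000, seed=0):
    """Monte Carlo estimate of P(agree(y_model, y_data) is True).

    model_sampler and data_sampler are hypothetical callables that draw
    one value each from the model's and the data's uncertainty
    distributions; agree is any Boolean agreement rule.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        y_model = model_sampler(rng)
        y_data = data_sampler(rng)
        if agree(y_model, y_data):
            hits += 1
    return hits / n_samples


def within_tolerance(eps):
    """Reliability-style rule: agree when |y_model - y_data| <= eps."""
    return lambda y_model, y_data: abs(y_model - y_data) <= eps


# Assumed toy distributions, purely illustrative.
def model(rng):
    return rng.gauss(1.00, 0.10)


def data(rng):
    return rng.gauss(1.05, 0.10)


p_agree = agreement_probability(model, data, within_tolerance(0.2))
print(f"estimated probability of agreement: {p_agree:.3f}")
```

Swapping `within_tolerance` for a different Boolean rule changes which validation requirement is being scored while leaving the estimator untouched, which is one way to read the abstract's claim that users can define novel validation requirements.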
