New Metric Formulas that Include Measurement Errors in Machine Learning for Natural Sciences

Umberto Michelucci; Francesca Venturini

arXiv:2209.15588 · cs.LG · October 3, 2022

Open Access

TL;DR

This paper introduces new formulas for evaluating machine learning models that explicitly incorporate measurement errors in the data, giving more realistic performance estimates, especially in physics and the other natural sciences.

Contribution

The paper derives general, model-independent formulas for commonly used metrics that account for measurement errors of the target variables, improving the reliability of model evaluation in scientific data analysis.

Findings

01

Formulas provide more pessimistic, realistic metric estimates.

02

Applicable to both regression and classification problems.

03

Valid for any measurement error type and data model.
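
To make the first finding concrete, here is a minimal sketch of how a measurement error on the targets can be folded into a regression metric. It uses standard first-order error propagation on the mean squared error, assuming independent errors sigma_i on each target value. This is an illustration only and is not taken from the paper; the function names, the example data, and the choice of "metric plus its propagated uncertainty" as the pessimistic estimate are all assumptions.

```python
import math


def mse(y_true, y_pred):
    """Plain mean squared error, which treats y_true as exact."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n


def mse_uncertainty(y_true, y_pred, sigma):
    """Propagated uncertainty of the MSE (hypothetical helper).

    Assumes independent measurement errors sigma[i] on y_true[i] and uses
    first-order propagation: d(MSE)/d(y_i) = 2 * (y_i - yhat_i) / n.
    """
    n = len(y_true)
    variance = sum(
        (2.0 * (t - p) / n) ** 2 * s ** 2
        for t, p, s in zip(y_true, y_pred, sigma)
    )
    return math.sqrt(variance)


# Made-up example data, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]   # measured targets
y_pred = [1.1, 1.9, 3.2, 3.8]   # model predictions
sigma = [0.1, 0.1, 0.1, 0.1]    # assumed measurement error per target

naive = mse(y_true, y_pred)
delta = mse_uncertainty(y_true, y_pred, sigma)
pessimistic = naive + delta     # upper estimate once target errors are included

print(f"naive MSE:       {naive:.4f}")
print(f"propagated error: {delta:.4f}")
print(f"pessimistic MSE: {pessimistic:.4f}")
```

The naive value is what is usually reported; the pessimistic value shows how much worse the metric could plausibly be when the targets themselves are uncertain. Other error models (correlated or non-Gaussian errors) would need a different propagation rule.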

Abstract

The application of machine learning to physics problems is widely found in the scientific literature. Both regression and classification problems are addressed by a large array of techniques that involve learning algorithms. Unfortunately, the measurement errors of the data used to train machine learning models are almost always neglected. This leads to estimates of model performance (and thus of generalisation power) that are too optimistic, since it is always assumed that the target variables (what one wants to predict) are correct. In physics, this is a dramatic deficiency, as it can lead to the belief that theories or patterns exist where, in reality, they do not. This paper addresses this deficiency by deriving formulas for commonly used metrics (both for regression and classification problems) that take into account measurement errors of target variables. The new…

Taxonomy

Topics: Computational Physics and Python Applications · Neural Networks and Applications