Uncertainty-Aware Diagnostics for Physics-Informed Machine Learning

Mara Daniels; Liam Hodgkinson; Michael Mahoney

arXiv:2510.26121·stat.ML·October 31, 2025

3 Reviews

TL;DR

This paper introduces the PILE score, a new uncertainty-aware metric within Gaussian process regression for physics-informed machine learning, improving model selection and understanding of epistemic uncertainty.

Contribution

The paper proposes the PILE score, a novel single metric for hyperparameter tuning in PIML that addresses ambiguity in model quality measurement and uncertainty estimation.

Findings

1. PILE score effectively guides hyperparameter selection.
2. Data-free PILE identifies well-adapted kernels for PDEs.
3. PILE score enhances understanding of epistemic uncertainty.

Abstract

Physics-informed machine learning (PIML) integrates prior physical information, often in the form of differential equation constraints, into the process of fitting machine learning models to physical data. Popular PIML approaches, including neural operators, physics-informed neural networks, neural ordinary differential equations, and neural discrete equilibria, are typically fit to objectives that simultaneously include both data and physical constraints. However, the multi-objective nature of this approach creates ambiguity in the measurement of model quality. This is related to a poor understanding of epistemic uncertainty, and it can lead to surprising failure modes, even when existing statistical metrics suggest strong fits. Working within a Gaussian process regression framework, we introduce the Physics-Informed Log Evidence (PILE) score. Bypassing the ambiguities of test losses,…
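As a rough illustration of what scoring a model by log evidence means in a Gaussian process setting, the sketch below computes the standard zero-mean GP log marginal likelihood and uses it to rank candidate kernel lengthscales. This is the generic textbook quantity, not the paper's PILE score: it contains no physics term, and the kernel, the synthetic data, the noise level, and the names `rbf_kernel` and `log_evidence` are all assumptions made for the example.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale, variance=1.0):
    """Squared-exponential kernel on 1D inputs (an assumed kernel choice)."""
    diff = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

def log_evidence(x, y, lengthscale, noise_std=0.1):
    """Standard zero-mean GP log marginal likelihood, log p(y | x, theta)."""
    n = x.shape[0]
    K = rbf_kernel(x, x, lengthscale) + noise_std**2 * np.eye(n)
    L = np.linalg.cholesky(K)                       # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # alpha = K^{-1} y
    data_fit = -0.5 * y @ alpha                     # rewards explaining the data
    complexity = -np.sum(np.log(np.diag(L)))        # equals -0.5 * log det K
    return data_fit + complexity - 0.5 * n * np.log(2.0 * np.pi)

# Synthetic data (hypothetical): a noisy sine wave on [0, 1].
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
y = np.sin(2.0 * np.pi * x) + 0.1 * rng.standard_normal(x.shape[0])

# Rank candidate lengthscales by a single number: the log evidence.
candidates = [0.01, 0.05, 0.2, 1.0, 5.0]
scores = {ell: log_evidence(x, y, ell) for ell in candidates}
best = max(scores, key=scores.get)

for ell in candidates:
    print(f"lengthscale={ell:<5} log evidence={scores[ell]:.2f}")
print("selected lengthscale:", best)
```

The point of the sketch is only that a single scalar balances fit against complexity, so no separate test loss is needed to compare hyperparameters. How the paper folds the differential equation constraint into that scalar is not shown here.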

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 8 · Confidence 3

Strengths

The paper offers a novel contribution to physics-informed machine learning by introducing PILE, a principled and uncertainty-aware model selection criterion based on Bayesian marginal likelihood. Unlike existing PIML approaches that rely on heuristic loss weighting or manual hyperparameter tuning, this work reframes the problem through Bayesian evidence maximization, providing a single score that jointly captures data-fit, physics-fit, model complexity, and uncertainty calibration. A particularl

Weaknesses

While the paper is strong overall, a few areas could be improved to enhance its practical impact. The work is primarily method-driven and introduces a novel and well-motivated framework, and I did not identify major weaknesses in the core methodology or theoretical development. My comments are therefore more about opportunities to further strengthen the empirical validation. The experiments are limited to relatively low-dimensional PDEs with simple settings; e.g., the main results focus on a 1D P

Reviewer 02 · Rating 6 · Confidence 3

Strengths

- The multi-objective evaluation issue is an important problem in PIML that practitioners struggle with. Having a principled diagnostic is valuable.
- A single metric for hyperparameter selection is very usable (as opposed to juggling multiple competing objectives such as data loss vs. physics loss vs. test error).

Weaknesses

- The current set-up is limited to Gaussian processes (not major, but it would obviously be great to have something model-agnostic).
- This also limits the problems in which PILE would be useful to rather small-scale problems, as GPs don't scale well. The PILE score itself introduces additional complexity. Hence, it is not fully clear what the practical limitations are in terms of computational cost, or to what degree it would be feasible to use in practice.
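For context on the scaling concern, the generic log evidence of a zero-mean Gaussian process with kernel matrix $K_\theta$ on $n$ training points and Gaussian noise variance $\sigma^2$ can be written as follows. The symbols are assumptions introduced here, and this is the textbook form rather than the paper's PILE definition:

```latex
\log p(\mathbf{y} \mid X, \theta)
  = -\tfrac{1}{2}\,\mathbf{y}^{\top} \left(K_\theta + \sigma^2 I\right)^{-1} \mathbf{y}
    \;-\; \tfrac{1}{2}\,\log\det\!\left(K_\theta + \sigma^2 I\right)
    \;-\; \tfrac{n}{2}\,\log 2\pi
```

Exact evaluation typically relies on a Cholesky factorization of the $n \times n$ matrix, which costs $O(n^3)$ time and $O(n^2)$ memory. That cubic cost is the usual reason exact GP methods are described as scaling poorly; any extra cost specific to PILE is not quantified in the text above.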

Reviewer 03 · Rating 4 · Confidence 3

Strengths

1. The paper addresses an important topic for incorporating the knowledge into data-driven learning.
2. Thorough theoretical treatment of the problem.

Weaknesses

1. Applicable to kernel learning only. It is still an open problem whether it can be extended to other ML techniques, especially neural networks. Although kernel learning is highly capable, there is still a large selection of NN-based ML methods that would greatly benefit from physics knowledge.
2. The organization and exposition of the paper can be further improved. It is good to be mathematically rigorous, but the exposition can be improved to give the readers more intuition, instead of piles of m
