Building effective models from sparse but precise data
Eric Cockayne, Axel van de Walle

TL;DR
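This paper proposes a Bayesian approach to building models from sparse but precise data. Rather than fitting a single underfitting model, it starts from an ensemble of models that overfit the data and uses physical knowledge to select the most likely one, which reproduces the data exactly and provides error estimates for systems outside the fit.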
Contribution
It introduces a Bayesian framework for constructing models from precise, noise-free data, replacing the usual least-squares fit of an underfitting model with an ensemble of overfitting models constrained by physical knowledge.
Findings
The most likely model within the Bayesian framework reproduces the original data exactly.
Error estimates are provided for systems not included in the fit.
The approach is applied to obtain a cluster expansion model for the Ca[Zr,Ti]O3 solid solution.
Abstract
A common approach in computational science is to use a set of highly precise but expensive calculations to parameterize a model that allows less precise, but more rapid calculations on larger scale systems. Least-squares fitting on a model that underfits the data is generally used for this purpose. For arbitrarily precise data free from statistical noise, e.g., ab initio calculations, we argue that it is more appropriate to begin with an ensemble of models that overfit the data. Within a Bayesian framework, a most likely model can be defined that incorporates physical knowledge, provides error estimates for systems not included in the fit, and reproduces the original data exactly. We apply this approach to obtain a cluster expansion model for the Ca[Zr,Ti]O3 solid solution.
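As a rough illustration of the idea in the abstract, the sketch below conditions a Gaussian prior over the parameters of a linear model on exact, noise-free data. It is not the authors' implementation. The Gaussian prior, the decaying prior variances that stand in for "physical knowledge", the random feature matrix, and the names `condition_on_exact_data`, `prior_cov`, and `x_new` are all assumptions made for illustration. The model has more parameters than data points, so it overfits by construction; the prior then picks out a most likely parameter vector that reproduces the data exactly, and the remaining posterior spread gives an error estimate for an input outside the fit.

```python
import numpy as np


def condition_on_exact_data(X, y, prior_cov):
    """Condition a zero-mean Gaussian prior w ~ N(0, prior_cov) on X @ w == y.

    Returns the most likely (posterior mean) parameters and the posterior
    covariance. No noise term is added, so the data are matched exactly.
    """
    K = X @ prior_cov @ X.T                      # (n_data, n_data) Gram matrix
    gain = np.linalg.solve(K, X @ prior_cov).T   # prior_cov @ X.T @ inv(K)
    w_map = gain @ y
    post_cov = prior_cov - gain @ X @ prior_cov
    return w_map, post_cov


rng = np.random.default_rng(0)

# Hypothetical problem: 3 precise data points, 6 model parameters (overfitting).
X = rng.normal(size=(3, 6))
y = np.array([1.0, -0.5, 2.0])

# Assumed prior: later parameters are expected to be smaller in magnitude.
prior_cov = np.diag(1.0 / (1.0 + np.arange(6)) ** 2)

w_map, post_cov = condition_on_exact_data(X, y, prior_cov)

# Prediction and error bar for an input that was not part of the fit.
x_new = rng.normal(size=6)
mean_new = x_new @ w_map
std_new = np.sqrt(max(x_new @ post_cov @ x_new, 0.0))

# Residual on the fitted data and predictive variance at the fitted inputs.
fit_residual = X @ w_map - y
train_var = np.einsum("ij,jk,ik->i", X, post_cov, X)

print("max |residual| on fitted data:", np.abs(fit_residual).max())
print("prediction for x_new:", mean_new, "+/-", std_new)
```

Under these assumptions the residual on the fitted data vanishes to numerical precision, the predictive variance is zero at the fitted inputs, and it is positive for inputs that are not linear combinations of them. A different prior would change both the most likely model and the error bars, which is where physical knowledge would enter.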
