On the Effect of Aleatoric and Epistemic Errors on the Learnability and Quality of NN-based Potential Energy Surfaces
S. Goswami, S. Käser, R. J. Bemish, M. Meuwly

TL;DR
This paper investigates how aleatoric and epistemic noise in quantum chemical reference data affects the training and quality of neural network-based potential energy surfaces, and contrasts a single-reference molecule with a multi-reference one.
Contribution
It provides a systematic analysis of the impact of different noise types on PES learning, emphasizing the importance of force accuracy and multi-reference effects.
Findings
Noise in energies has limited impact on simple molecules like H2CO.
Force noise significantly affects PES quality, especially in complex systems.
Multi-reference character correlates with model deterioration under noise.
Abstract
The effect of noise in the input data for learning potential energy surfaces (PESs) based on neural networks for chemical applications is assessed. Noise in energies and forces can result from aleatoric and epistemic errors in the quantum chemical reference calculations. Statistical (aleatoric) noise arises for example due to the need to set convergence thresholds in the self-consistent field (SCF) iterations whereas systematic (epistemic) noise is due to, inter alia, particular choices of basis sets in the calculations. The two molecules considered here as proxies are H2CO and HONO which are examples for single- and multi-reference problems, respectively, for geometries around the minimum energy structure. For H2CO it is found that adding noise to energies with magnitudes representative of single-point calculations does not deteriorate the quality of the final PESs…
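To make the distinction between the two noise types concrete, the following is a minimal Python sketch, not the authors' procedure. It assumes aleatoric noise can be mimicked by zero-mean Gaussian perturbations and epistemic error by a deterministic scale and offset. The function names, the reference energies, and the values of `sigma`, `scale`, and `offset` are all hypothetical and chosen only for illustration.

```python
import random


def add_aleatoric_noise(values, sigma, seed=0):
    """Return a copy of `values` with zero-mean Gaussian noise of width `sigma`.

    Assumption: statistical (aleatoric) noise is modeled as independent
    Gaussian perturbations; the seed makes the perturbation reproducible.
    """
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in values]


def add_epistemic_shift(values, scale=1.0, offset=0.0):
    """Return a copy of `values` with a deterministic distortion applied.

    Assumption: systematic (epistemic) error is modeled as a fixed scale
    and offset, a crude stand-in for, e.g., a different basis set.
    """
    return [scale * v + offset for v in values]


def rmse(a, b):
    """Root-mean-square difference between two equally long sequences."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


# Hypothetical reference energies in arbitrary units (not data from the paper).
reference = [0.0, 0.5, 1.2, 2.0, 3.1]

noisy = add_aleatoric_noise(reference, sigma=1e-3)   # random scatter
shifted = add_epistemic_shift(reference, scale=1.01) # systematic bias

print("RMSE, aleatoric:", rmse(reference, noisy))
print("RMSE, epistemic:", rmse(reference, shifted))
```

In this toy setting the aleatoric perturbation changes with the seed, whereas the epistemic distortion is identical on every call. That difference is why repeating or averaging calculations can reduce the former but not the latter.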
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Topic Modeling
