Assessing Non-Nested Configurations of Multifidelity Machine Learning for Quantum-Chemical Properties
Vivin Vinod, Peter Zaspel

TL;DR
This paper evaluates the effectiveness of non-nested training data configurations in multifidelity machine learning models for quantum-chemical properties, finding that optimized MFML can perform well without nested data requirements.
Contribution
It demonstrates that the optimized MFML (o-MFML) method can effectively handle non-nested training data, unlike conventional MFML, which requires a nested data structure.
Findings
MFML requires nested data for optimal performance.
o-MFML performs well with non-nested data, comparable to nested configurations.
Results are based on predicting energies of diverse molecules in the CheMFi dataset.
Abstract
Multifidelity machine learning (MFML) for quantum chemical (QC) properties has seen strong development in recent years. The method has been shown to reduce the cost of generating training data for high-accuracy low-cost ML models. In such a set-up, the ML models are trained on molecular geometries and some property of interest computed at various computational chemistry accuracies, or fidelities. These are then combined in training the MFML models. In some multifidelity models, the training data is required to be nested, that is, the same molecular geometries are included to calculate the property across all the fidelities. In these multifidelity models, the requirement of a nested configuration restricts the kind of sampling that can be performed while selecting training samples at different fidelities. This work assesses the use of non-nested training data for two of these…
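The difference between nested and non-nested configurations described in the abstract can be illustrated with a minimal sketch of how training indices might be drawn per fidelity. This is not the authors' sampling code. The function names, the pool size, and the choice of smaller training sets at higher fidelities are all assumptions made for illustration.

```python
import random


def sample_nested(n_pool, sizes, seed=0):
    """Draw index sets so each higher-fidelity set is a subset of the one below.

    `sizes` is ordered from lowest to highest fidelity and is assumed to be
    non-increasing (hypothetical setup: fewer samples at costlier fidelities).
    """
    rng = random.Random(seed)
    order = list(range(n_pool))
    rng.shuffle(order)
    # Every fidelity takes a prefix of the same shuffled order, so the same
    # geometries reappear at every fidelity that is large enough to hold them.
    return [order[:s] for s in sizes]


def sample_non_nested(n_pool, sizes, seed=0):
    """Draw each fidelity's index set independently from the same pool."""
    rng = random.Random(seed)
    return [rng.sample(range(n_pool), s) for s in sizes]


def is_nested(index_sets):
    """Check that each set is contained in the set at the next-lower fidelity."""
    return all(set(hi) <= set(lo) for lo, hi in zip(index_sets, index_sets[1:]))


if __name__ == "__main__":
    sizes = [64, 32, 16]  # hypothetical: low, medium, high fidelity
    nested = sample_nested(1000, sizes)
    free = sample_non_nested(1000, sizes)
    print("nested configuration is nested:", is_nested(nested))
    print("independent configuration is nested:", is_nested(free))
```

In the nested case the geometries to compute at each fidelity are fixed by the lower-fidelity choice. In the non-nested case each fidelity can be sampled freely, which is the flexibility the abstract refers to.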
