Benchmarking Data Efficiency in $\Delta$-ML and Multifidelity Models for Quantum Chemistry

Vivin Vinod; Peter Zaspel

arXiv:2410.11391·physics.chem-ph·March 26, 2025

TL;DR

This paper compares various multifidelity machine learning methods for quantum chemistry, demonstrating that multifidelity approaches generally reduce data costs and improve efficiency, especially for large prediction sets.

Contribution

The study introduces and benchmarks the MFΔML method, showing its advantages over existing Δ-ML and multifidelity methods in quantum chemistry predictions.

Findings

1. Multifidelity methods outperform Δ-ML for large prediction sets.
2. MFΔML is more efficient for applications with fewer evaluations.
3. Multifidelity approaches reduce training data costs compared to single fidelity models.

Abstract

The development of machine learning (ML) methods has made quantum chemistry (QC) calculations more accessible by reducing the compute cost incurred in conventional QC methods. This has since been translated into the overhead cost of generating training data. Increased work in reducing the cost of generating training data resulted in the development of $\Delta$-ML and multifidelity machine learning methods which use data at more than one QC level of accuracy, or fidelity. This work compares the data costs associated with $\Delta$-ML, multifidelity machine learning (MFML), and optimized MFML (o-MFML) in contrast with a newly introduced Multifidelity $\Delta$-Machine Learning (MF$\Delta$ML) method for the prediction of ground state energies, vertical excitation energies, and the magnitude of electronic contribution of molecular dipole moments from the multifidelity benchmark dataset…
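As a rough illustration of the general $\Delta$-ML idea named in the abstract (learn the difference between a cheap and an expensive fidelity, then add it back to the cheap value), here is a minimal sketch. It is not the paper's implementation: the one-dimensional toy functions `low_fidelity` and `high_fidelity`, the Gaussian-kernel ridge regression helpers, and all hyperparameters are assumptions made purely for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between row-wise samples a and b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def krr_fit(x, y, sigma=1.0, reg=1e-6):
    """Kernel ridge regression: solve (K + reg * I) alpha = y."""
    k = gaussian_kernel(x, x, sigma)
    return np.linalg.solve(k + reg * np.eye(len(x)), y)


def krr_predict(x_train, alpha, x_new, sigma=1.0):
    """Evaluate the fitted kernel model at new points."""
    return gaussian_kernel(x_new, x_train, sigma) @ alpha


# Hypothetical stand-ins for a cheap and an expensive QC method.
def low_fidelity(x):
    return np.sin(x[:, 0])


def high_fidelity(x):
    return np.sin(x[:, 0]) + 0.1 * x[:, 0]


x_train = np.linspace(-3.0, 3.0, 15)[:, None]
x_test = np.linspace(-2.5, 2.5, 50)[:, None]

# Delta-ML: train on the difference between the two fidelities ...
alpha = krr_fit(x_train, high_fidelity(x_train) - low_fidelity(x_train))

# ... and predict the high fidelity as low fidelity plus learned correction.
delta_pred = low_fidelity(x_test) + krr_predict(x_train, alpha, x_test)
mae_delta = np.mean(np.abs(delta_pred - high_fidelity(x_test)))
```

Note that this scheme still needs a low-fidelity evaluation for every point it predicts, which matters most when the prediction set is large.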

Code & Models

Repositories

SM4DA/MFDeltaML (Official)

Taxonomy

Topics: Data Quality and Management

Methods: Sparse Evolutionary Training