TL;DR
This paper introduces a hierarchy of molecular representations grounded in the postulates of quantum mechanics and shows that including higher-order many-body terms improves the accuracy of machine learning models that predict molecular properties.
Contribution
It proposes a systematic way to control the target similarity of a molecular representation through interatomic many-body expansions, which improves ML prediction accuracy for molecular properties.
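For orientation, an interatomic many-body expansion of the kind referred to here is commonly written in the generic form below. The notation is an assumption made for illustration (the symbols $E^{(n)}$, $R_{IJ}$, and $\theta_{IJK}$ do not come from this summary), and the paper's own functional forms may differ.

```latex
E(\{\mathbf{R}_I\}) \approx
    \sum_{I} E^{(1)}_{I}
  + \underbrace{\sum_{I<J} E^{(2)}(R_{IJ})}_{\text{Bonding}}
  + \underbrace{\sum_{I<J<K} E^{(3)}(R_{IJ}, R_{JK}, \theta_{IJK})}_{\text{Angular}}
  + \dots
```

Truncating the series after the two-body sum gives a bonding-only description. Each further term adds higher-order geometric information, which is the sense in which the expansion order can act as a systematic control.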
Findings
Higher-order representation terms improve predictive accuracy.
BAML models outperform existing methods in speed and accuracy.
Systematically including more expansion terms improves predictions across diverse molecular properties.
Abstract
The predictive accuracy of Machine Learning (ML) models of molecular properties depends on the choice of the molecular representation. Based on the postulates of quantum mechanics, we introduce a hierarchy of representations which meet uniqueness and target similarity criteria. To systematically control target similarity, we rely on interatomic many-body expansions, as implemented in universal force-fields, including Bonding, Angular, and higher-order terms (BA). Addition of higher-order contributions systematically increases similarity to the true potential energy and predictive accuracy of the resulting ML models. We report numerical evidence for the performance of BAML models trained on molecular properties pre-calculated at electron-correlated and density functional theory levels of theory for thousands of small organic molecules. Properties studied include enthalpies and free…
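As a rough illustration of how a representation can grow with expansion order, the sketch below builds a feature vector from two-body (distance) terms and optionally appends three-body (angle) terms. It is a simplified, purely geometric stand-in and not the paper's BAML implementation: the function names `bond_terms`, `angle_terms`, and `ba_features` are hypothetical, element types are ignored (so the vector does not satisfy the uniqueness criterion on its own), and raw distances and angles are used in place of force-field energy terms.

```python
import itertools

import numpy as np


def bond_terms(coords):
    """Two-body terms: sorted distances over all unordered atom pairs."""
    coords = np.asarray(coords, dtype=float)
    pairs = itertools.combinations(range(len(coords)), 2)
    return np.sort([np.linalg.norm(coords[i] - coords[j]) for i, j in pairs])


def angle_terms(coords):
    """Three-body terms: sorted angles (radians) at every vertex atom j."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    angles = []
    for j in range(n):
        others = [i for i in range(n) if i != j]
        for i, k in itertools.combinations(others, 2):
            u = coords[i] - coords[j]
            v = coords[k] - coords[j]
            cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
            # Clip to guard against round-off pushing |cos| above 1.
            angles.append(np.arccos(np.clip(cos, -1.0, 1.0)))
    return np.sort(angles)


def ba_features(coords, order=2):
    """Concatenate terms up to the given body order (2 = bonds, 3 = + angles)."""
    feats = [bond_terms(coords)]
    if order >= 3:
        feats.append(angle_terms(coords))
    return np.concatenate(feats)


# Approximate water-like geometry in angstroms (illustrative values only).
water = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]
bonds_only = ba_features(water, order=2)
bonds_and_angles = ba_features(water, order=3)
```

For a three-atom molecule the bonding-only vector has three entries and the bonding-plus-angular vector has six. A model trained on the longer vector sees strictly more geometric information, which mirrors the hierarchy described in the abstract.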
