Tree based machine learning framework for predicting ground state energies of molecules
Burak Himmetoglu

TL;DR
This paper demonstrates that boosted regression trees can accurately and efficiently predict molecular ground state energies, outperforming neural networks, and enabling advanced applications in molecular discovery and materials informatics.
Contribution
The study introduces a boosted regression tree framework trained on 16,242 PubChem molecules whose ground state energies were computed with density functional theory, showing improved accuracy and lower computational cost than neural network regression.
Findings
Boosted regression trees outperform neural networks in energy prediction accuracy.
The model achieves significantly lower computational cost than neural network regression.
The approach generalizes well to molecules with additional elements like Cl and Si.
Abstract
We present an application of the boosted regression tree algorithm for predicting ground state energies of molecules made up of C, H, N, O, P, and S (CHNOPS). The PubChem chemical compound database has been incorporated to construct a dataset of 16,242 molecules, whose electronic ground state energies have been computed using density functional theory. This dataset is used to train the boosted regression tree algorithm, which allows a computationally efficient and accurate prediction of molecular ground state energies. Predictions from boosted regression trees are compared with neural network regression, a widely used method in the literature, and shown to be more accurate with significantly reduced computational cost. The performance of the regression model trained using the CHNOPS set is also tested on a set of distinct molecules that contain additional Cl and Si atoms. It is shown…
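To make the boosting idea in the abstract concrete, here is a minimal sketch of gradient-boosted regression for squared error, written with the Python standard library only. It is not the paper's implementation: the trees are depth-1 stumps, the three input columns are hypothetical stand-ins for molecular descriptors, the target is a synthetic function standing in for a DFT energy, and the names `fit_stump`, `fit_boosted`, and `predict` are invented for illustration.

```python
import math
import random


def fit_stump(X, residuals):
    """Fit a depth-1 regression tree: one feature, one threshold, two leaf means."""
    n, d = len(X), len(X[0])
    total = sum(residuals)
    mean = total / n
    # Fallback (no valid split): threshold +inf sends every sample to the left leaf.
    best = (float("inf"), 0, float("inf"), mean, mean)
    for j in range(d):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            prev, cur = order[k - 1], order[k]
            left_sum += residuals[prev]
            if X[prev][j] == X[cur][j]:
                continue  # cannot split between identical feature values
            right_sum = total - left_sum
            # Minimizing squared error is equivalent to minimizing this score.
            score = -(left_sum ** 2 / k + right_sum ** 2 / (n - k))
            if score < best[0]:
                thr = 0.5 * (X[prev][j] + X[cur][j])
                best = (score, j, thr, left_sum / k, right_sum / (n - k))
    return best[1:]


def stump_predict(stump, x):
    j, thr, left, right = stump
    return left if x[j] <= thr else right


def fit_boosted(X, y, n_rounds=100, learning_rate=0.1):
    """Each round fits a stump to the current residuals and adds a shrunken copy."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump_predict(stump, xi)
                for pi, xi in zip(pred, X)]
    return base, learning_rate, stumps


def predict(model, x):
    base, lr, stumps = model
    return base + lr * sum(stump_predict(s, x) for s in stumps)


def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


# Synthetic data (assumption): 3 hypothetical descriptors, made-up target function.
random.seed(0)
X = [[random.random() for _ in range(3)] for _ in range(300)]
y = [math.sin(3 * x[0]) + x[1] ** 2 - 0.5 * x[2] for x in X]
X_train, y_train = X[:200], y[:200]
X_test, y_test = X[200:], y[200:]

model = fit_boosted(X_train, y_train)
rmse_baseline = rmse(y_test, [model[0]] * len(y_test))  # predict the training mean
rmse_model = rmse(y_test, [predict(model, x) for x in X_test])
print("baseline RMSE:", rmse_baseline)
print("boosted RMSE:", rmse_model)
```

The held-out error is compared against a predict-the-mean baseline only to show that the ensemble learns something. A practical setup would more likely use deeper trees from an established boosting library and real molecular features, neither of which is specified in the text above.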
