Prediction of Atomization Energy Using Graph Kernel and Active Learning
Yu-Hang Tang, Wibe A. de Jong

TL;DR
This paper introduces a framework that combines a marginalized graph kernel, Gaussian process regression, and active learning to predict molecular atomization energies, addressing the challenges of data structure, symmetry, and predictive confidence.
Contribution
It presents a kernel-based pipeline that converts molecules into labeled graphs, measures their similarity with the marginalized graph kernel, and uses Gaussian process regression with active learning to predict atomization energy.
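As a rough illustration of how Gaussian process regression and active learning can fit together, the sketch below works from a precomputed kernel matrix and repeatedly labels the pool sample with the largest predictive variance. It makes several assumptions that do not come from the paper: the function names `gpr_predict` and `active_learning` are hypothetical, the variance-based selection rule is one common choice rather than the authors' stated procedure, and a Gaussian kernel on toy 1-D inputs stands in for the marginalized graph kernel.

```python
import numpy as np

def gpr_predict(K_train, K_cross, k_diag, y_train, noise=1e-4):
    """GP posterior mean and variance from precomputed kernel blocks.

    K_train: (n, n) kernel among training samples
    K_cross: (m, n) kernel between query and training samples
    k_diag:  (m,)   kernel of each query sample with itself
    """
    # Cholesky factor of the regularized training kernel
    L = np.linalg.cholesky(K_train + noise * np.eye(len(y_train)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_cross @ alpha
    v = np.linalg.solve(L, K_cross.T)
    var = k_diag - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)  # clamp small negative round-off

def active_learning(K, y, n_init=5, n_total=20, seed=0):
    """Grow a training set by adding the most uncertain pool sample (assumed rule)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    train = [int(i) for i in rng.choice(n, size=n_init, replace=False)]
    while len(train) < n_total:
        pool = [i for i in range(n) if i not in train]
        _, var = gpr_predict(K[np.ix_(train, train)],
                             K[np.ix_(pool, train)],
                             K[pool, pool],  # diagonal entries for the pool
                             y[train])
        train.append(pool[int(np.argmax(var))])
    return train

# Toy stand-in data: a Gaussian kernel on 1-D inputs replaces the graph kernel.
x = np.linspace(0.0, 1.0, 50)
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * 0.2 ** 2))
y = np.sin(2 * np.pi * x)

train = active_learning(K, y)
mean, var = gpr_predict(K[np.ix_(train, train)], K[:, train], np.diag(K), y[train])
```

In a real setting, `K` would hold graph-kernel similarities between molecules and `y` the reference atomization energies; only the labels of the selected samples would need to be computed.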
Findings
Achieved a mean absolute error of 0.62 kcal/mol with 2000 training samples
Demonstrated that the graph kernel is well suited to predicting extensive properties
Analyzed the effect of the kernel hyperparameters on accuracy and predictive confidence
Abstract
Data-driven prediction of molecular properties presents unique challenges to the design of machine learning methods concerning data structure/dimensionality, symmetry adaption, and confidence management. In this paper, we present a kernel-based pipeline that can learn and predict the atomization energy of molecules with high accuracy. The framework employs Gaussian process regression to perform predictions based on the similarity between molecules, which is computed using the marginalized graph kernel. To apply the marginalized graph kernel, a spatial adjacency rule is first employed to convert molecules into graphs whose vertices and edges are labeled by elements and interatomic distances, respectively. We then derive formulas for the efficient evaluation of the kernel. Specific functional components for the marginalized graph kernel are proposed, while the effect of the associated…
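The molecule-to-graph step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's rule: the excerpt does not specify the spatial adjacency rule, so a fixed distance cutoff is assumed here, and the function name `molecule_to_graph`, the `cutoff` parameter, and the approximate water geometry are all hypothetical.

```python
import numpy as np

def molecule_to_graph(elements, coords, cutoff=2.0):
    """Convert a molecule into a labeled graph.

    Vertices are labeled by element symbols. An edge joins two atoms
    whose distance is below `cutoff` (an assumed adjacency rule) and is
    labeled by that interatomic distance.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(elements)
    # Pairwise Euclidean distances between all atoms
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    nodes = list(elements)
    edges = {}
    for i in range(n):
        for j in range(i + 1, n):
            if dist[i, j] < cutoff:
                edges[(i, j)] = float(dist[i, j])
    return nodes, edges

# Approximate water geometry in angstroms (illustrative values)
elements = ["O", "H", "H"]
coords = [[0.000, 0.000, 0.117],
          [0.000, 0.757, -0.467],
          [0.000, -0.757, -0.467]]

nodes, edges = molecule_to_graph(elements, coords, cutoff=1.2)
```

With this cutoff, the two O-H pairs become edges while the H-H pair does not. A graph kernel would then compare such graphs through their vertex and edge labels.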
