Bond Energies from a Diatomics-in-Molecules Neural Network
Kun Yao, John Herr, Seth Brown, John Parkhill

TL;DR
This paper introduces a neural network model that predicts molecular energies as sums of bond energies, providing both accurate energy predictions and chemical insights into bond strengths based on molecular environment.
Contribution
The work presents a neural network that predicts bond energies from total molecular energies, offering interpretability and scalability for large molecules, aligning with chemical intuition.
Findings
Achieves a MAE of 0.94 kcal/mol on the GDB9 dataset
Predicts relative bond strengths consistent with experimental trends
Learns heuristic bond strength trends similar to expert chemists
[Table 1: relative bond strength comparisons, cases 1 to 10, with columns Case #, Chemist, Neural Network, and NBO; cell contents not recovered]
Intrinsic Bond Energies from a Diatomics-in-Molecules Neural Network
Kun Yao
John E. Herr
Seth N. Brown
John Parkhill
Dept. of Chemistry and Biochemistry, The University of Notre Dame du Lac
Abstract
Neural networks are being used to make new types of empirical chemical models as inexpensive as force fields, but with accuracy similar to the ab-initio methods used to build them. Besides modeling potential energy surfaces, neural networks can provide qualitative insights and make qualitative chemical trends quantitatively predictable. In this work we present a neural network that predicts the energies of molecules as a sum of intrinsic bond energies. The network learns the total energies of the popular GDB9 dataset to a competitive MAE of 0.94 kcal/mol on molecules outside of its training set, is naturally linearly scaling, and is applicable to molecules consisting of thousands of bonds. More importantly, it gives chemical insight into the relative strengths of bonds as a function of their molecular environment, despite only being trained on total energy information. We show that the network makes predictions of relative bond strengths in good agreement with measured trends and human predictions. A Diatomics-in-Molecules Neural Network (DIM-NN) learns heuristic relative bond strengths like expert synthetic chemists, and compares well with ab-initio bond order measures such as NBO analysis.
Neural networks (NNs) make accurate models of high dimensional functions, with chemical applications such as density functionals1, 2, 3, potential energy surfaces4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 and molecular properties31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58. With modern general-purpose graphical processing unit (GP-GPU) computing, NNs are inexpensive to train and evaluate, with a cost that lies much closer to a force field than to ab-initio theory. To model energy extensively, most black-box NN schemes partition the energy of a molecule into fragments9, 13, 25, 59. Separate networks are often used for qualitatively different energy contributions to maximize accuracy and efficiency. Besides balancing accuracy with complexity, a decomposition of the energy can yield chemical insights, and give the heuristic principles of chemistry reproducible, quantitative rigor. In this paper we use a NN to express the cohesive energy of a molecule as a sum of bond energies. Besides offering a very accurate and inexpensive decomposition of the total energy, this method, the Diatomics-in-Molecules Neural Network (DIM-NN), produces an instant estimate of the embedded unrelaxed bond strengths in a molecule. DIM-NN responds quantitatively to molecular geometry in a way that textbook tabulated values cannot, and updates a textbook concept used by all chemists, making it a quantitative tool.
Atoms have been a popular choice of NN energy decomposition since the pioneering contributions of Behler, Khaliullin and Parrinello9, 10, 4, 15. Within the atom scheme, separate networks are trained for each element. Another reasonable choice for non-covalent aggregates, explored in our own work, is a many-body expansion59. These different fragmentations of the energy navigate a trade-off between breadth and accuracy. One atom’s contribution to the total energy is difficult to learn because it varies significantly when the atom is engaged in different bonds. One sort of three-body interaction25, 59, 28 is comparatively easy to learn, but there are too many three-body combinations in chemistry to learn them all. The purpose of this paper is to explore the advantages of bonds as a decomposition unit.
Bonds have served as the central abstraction throughout the history of chemistry60, 61. They are a popular building block for neural network inputs32, but to our knowledge no NN has been used to predict total energies as a sum of bond energies as we are about to describe. One reason for this might be the large, but manageable, number of bond networks required to make a general model chemistry. DIM-NN requires a complex neural network with many bond branches that must be dynamically learned and evaluated. We developed a general open-source software framework for producing NN models62 of molecules, TensorMol, in which we have implemented DIM-NN and which simplifies the process of creating the bond-centered network. The complete source allowing readers to reproduce and extend this work is publicly available in the TensorMol repository63.
The left panel of Fig. 1 schematically describes how DIM-NN is trained and evaluated. A molecule is broken down into overlapping diatomic fragments, such as C-H, C-C, C-O, etc. An optimal choice of descriptor describing the chemical environment of the bond is crucial for neural networks to reach their best performance10, 64, 17, 31, 65, 50. This work uses our own version of the Bag of Bonds (BoBs)32, 31 descriptor, which is similar to BALM66, for this purpose. The descriptor contains the bond length of the target bond, and the lengths and angles of bonds attached to it in order of connectivity. Each type of bond has its own branch consisting of three fully-connected hidden layers with 1000 neurons in each layer, followed by a summation that yields the bond energy. The energies of all bonds are summed up at the last layer to give a predicted molecular total energy, which is the training target. Backpropagation of errors67 to the previous layers allows the bond branches to learn the strengths of their specific bond types. These errors propagate back through a linear transformation matrix that maps the many bond energies produced by a type branch, which come from different molecules, back onto those molecules. We use the popular GDB9 database68, 55, including H, C, N and O, for our training data, which is more than 130,000 molecules in total. Before training, 20% of the database was chosen randomly as a test set and kept independent. Our results also examine larger molecules, which are not a part of GDB9. Further methodological details are presented in the supplement.
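The forward pass described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the TensorMol implementation: the function names, the tanh activation, the fixed descriptor length of 4, the hidden width of 8 (the text uses 1000), the random untrained weights, and the made-up descriptor values are all hypothetical. The scatter-add in `molecular_energies` plays the role of the linear transformation matrix that maps bond energies back onto their molecules.

```python
import math
import random

def make_branch(n_in, n_hidden, rng):
    """One bond-type branch: three hidden layers plus a single linear output.
    Width and initialization are illustrative assumptions."""
    sizes = [n_in] + [n_hidden] * 3 + [1]
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(fan_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        layers.append((weights, [0.0] * fan_out))
    return layers

def branch_energy(layers, descriptor):
    """Map one bond descriptor to one scalar bond energy."""
    x = list(descriptor)
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:  # tanh on hidden layers only (assumed activation)
            x = [math.tanh(v) for v in x]
    return x[0]

def molecular_energies(branches, bonds, n_molecules):
    """bonds: list of (molecule_index, bond_type, descriptor).
    Each bond is routed to the branch for its type, and its energy is
    added to the total of the molecule it belongs to."""
    totals = [0.0] * n_molecules
    per_bond = []
    for mol_idx, bond_type, descriptor in bonds:
        energy = branch_energy(branches[bond_type], descriptor)
        per_bond.append(energy)
        totals[mol_idx] += energy
    return totals, per_bond

rng = random.Random(0)
branches = {t: make_branch(4, 8, rng) for t in ("C-H", "C-C", "C-O")}
# Made-up descriptors: target bond length first, then neighbor lengths/angles.
bonds = [
    (0, "C-H", [1.09, 1.09, 1.09, 1.91]),
    (0, "C-C", [1.53, 1.09, 1.09, 1.94]),
    (1, "C-O", [1.43, 1.09, 0.96, 1.89]),
    (1, "C-H", [1.09, 1.43, 1.09, 1.91]),
]
totals, per_bond = molecular_energies(branches, bonds, n_molecules=2)
```

Because only `totals` would be compared with reference energies during training, the per-bond values in `per_bond` are never fitted directly; they are a by-product of the decomposition, which is the point made in the text.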
Our results try to answer three questions we had about DIM-NN: how well does DIM-NN predict total energies, how well does DIM-NN predict bond strength trends, and what is the relationship between DIM-NN bond energies and observed chemical reactivity. We were especially excited about DIM-NN as a method to systematically and quantitatively reproduce chemical heuristics, and appealed to expert synthetic chemists to make predictions of relative bond strengths to test DIM-NN. Because DIM-NN predicts the energy of a molecule relative to free neutral atoms, we expect bond strengths in DIM-NN to correlate best with unrelaxed, homolytic dissociation energies, and some evidence to this effect is presented later on. However, no specific products are implied by DIM-NN bond energies. They depend only on the equilibrium geometry. Some sorts of bonds are strong near equilibrium but can kinetically access products which are relatively more stable, or which electronically relax, and in these cases a chemist’s intuition may be at variance with DIM-NN.
The training mean absolute error (vs. B3LYP/6-31G(2df,p)) of the molecular energies of our DIM-NN reaches 0.86 kcal/mol and the independent test mean absolute error is 0.94 kcal/mol. The test error is close to the training error, suggesting that our model is neither over-fitting nor under-fitting. The accuracy of our model is competitive with the state-of-the-art sub-1 kcal/mol accuracy on this dataset32, 66, 50, 48, 69, 6, 65, 70. The bond-wise nature of our model makes it transferable to large molecules. The right panel of Fig. 1 shows the DIM-NN predicted total energy and the calculated DFT total energy of the morphine molecule, which is not in our training set and contains 21 heavy atoms. We also tested DIM-NN on vitamins D2 and B5. The differences between the DIM-NN and DFT energies of these two molecules are 0.6 kcal/mol and 1.2 kcal/mol, respectively. All these errors are small relative to the inherent errors of the B3LYP model chemistry used to produce DIM-NN. Our model can be trivially trained on higher quality chemical data as it becomes available. Correct reproduction of the bond trends described later on depends sensitively on the accuracy of the total energies. Before completion of the training process, when total energies are in error by roughly 5 kcal/mol, more than 40% of the examples in Table 1 and Table S1 are answered incorrectly by DIM-NN. This is because errors can accumulate in bonds and cancel in the molecular energy unless the NN is tightly trained on a broad sample of chemical space.
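The evaluation protocol behind these numbers, a random 20% hold-out followed by a mean absolute error on total energies, can be sketched as follows. The helper names, the seed, and the toy sizes are hypothetical; this is not the authors' script, only an illustration of the bookkeeping.

```python
import random

def split_indices(n_items, test_fraction=0.2, seed=0):
    """Shuffle item indices and hold out a random fraction as the test set."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_items * test_fraction))
    return indices[n_test:], indices[:n_test]  # (train, test)

def mean_absolute_error(predicted, reference):
    """MAE in whatever unit the inputs use (kcal/mol in the text)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)

# Toy usage with 10 items instead of the ~130,000 GDB9 molecules.
train_idx, test_idx = split_indices(10)
toy_mae = mean_absolute_error([1.0, 2.0], [1.5, 1.0])
```

Comparing the MAE on `train_idx` with the MAE on `test_idx` is the check the paragraph relies on when it notes that the training and test errors are close.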
Our scheme not only calculates the total energy of a molecule, but also predicts the strengths of bonds individually. Fig. 1 shows the bonds in the morphine molecule drawn with colors keyed to the bond strengths predicted by DIM-NN. The slight energy differences predicted by DIM-NN between bonds with identical bonding but different environments are perceptible. DIM-NN can suggest how the stress is distributed in an unstable isomer compared with the stable one. Fig. 3 shows the geometry of cis- and trans-bicyclo[4.1.0]heptane. The DFT calculation shows that the cis- structure is 26.3 kcal/mol more stable than the trans- structure. DIM-NN predicts an energy difference of 24.6 kcal/mol, in good agreement for this molecule outside GDB9. Fig. 3 also shows how the stress is distributed in the trans- structure. Within the carbon framework, the strain is distributed rather equally among the bonds in the cyclohexane ring. Some C-H bonds become weaker, especially those attached to the ring junction carbons, and some become stronger.
Fig. 2 shows the bond energies of all the carbon-carbon bonds in the GDB9 database with respect to their bond lengths. Different bond types (single, double, triple and conjugated versions of those) are indicated by color. The spread of each cloud is caused by the different chemical environments of the bonds. Bond strength correlates predictably with bond order, but less obvious general chemical trends are also revealed. For example, the energies of single bonds are more sensitive to the environment than those of double or triple bonds. DIM-NN also predicts that bond lengths are roughly as relevant to their energies as all other factors combined.
DIM-NN predicts that a bicyclic C-C single bond shared by a three-membered carbon ring and a four-membered carbon ring is extremely weak, consistent with chemical intuition. The strongest C-C single bond predicted by DIM-NN is the bond that is connected with C-C triple bonds, in agreement with textbook bond dissociation energies71, 72. DIM-NN bond energies reproduce several other established chemical rules of thumb, for example that the strength of a C-H bond decreases in the series: methyl carbon, primary carbon in ethane, secondary carbon in propane, tertiary carbon in isobutane. The bond energies of these four types of carbon-hydrogen bonds predicted by DIM-NN are 105.2 kcal/mol, 105.0 kcal/mol, 95.2 kcal/mol and 89.8 kcal/mol, respectively, where the experimental values are 105.1 kcal/mol, 98.2 kcal/mol, 95.1 kcal/mol and 93.2 kcal/mol, respectively71. DIM-NN also predicts that the C-C single bond in pyrrole is 1.2 kcal/mol more stable than the C-C single bond in furan, which agrees with greater bond delocalization in pyrrole than in furan, consistent with its larger NICS aromaticity73, 74.
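The C-H series above can be checked with a short script, using the numbers exactly as quoted in the preceding paragraph. The variable and function names are illustrative; the script only tests whether each series decreases monotonically and how far the predicted values sit from the experimental ones.

```python
# C-H bond energies in kcal/mol, copied from the paragraph above.
carbon_types = ["methyl", "primary (ethane)", "secondary (propane)", "tertiary (isobutane)"]
dim_nn = [105.2, 105.0, 95.2, 89.8]
experiment = [105.1, 98.2, 95.1, 93.2]

def strictly_decreasing(values):
    """True if every entry is smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))

# Signed deviation (DIM-NN minus experiment), rounded to the quoted precision.
deviations = [round(p - e, 1) for p, e in zip(dim_nn, experiment)]
mean_abs_dev = sum(abs(d) for d in deviations) / len(deviations)

trend_reproduced = strictly_decreasing(dim_nn) and strictly_decreasing(experiment)

for name, dev in zip(carbon_types, deviations):
    print(f"{name:>22s}: {dev:+.1f} kcal/mol")
```

Both series decrease along the sequence, so the qualitative rule of thumb is reproduced, while the per-bond deviations show where the quantitative agreement with experiment is tighter or looser.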
We asked a synthetic colleague for a quiz consisting of 19 pairs of bond strength comparisons. We compare the predictions of relative bond strength made by the chemist, natural bond orbital analysis (NBO)75, 76, 77 and DIM-NN. Table 1 and Table S1 show that the problems in this quiz range in difficulty from pairs separated by 10 kcal/mol to subtle differences small enough to challenge the density functional data used to produce DIM-NN. NBO makes predictions disagreeing with the chemist in five cases, while DIM-NN disagrees with the chemist in two cases: case 3 and case S6. Both of these cases are comparisons of C-H bond strength where the carbon atom is connected to an oxygen atom, and NBO also dissents from the chemist in these cases. We believe this disagreement arises because neither NBO nor the DIM-NN scheme relaxes the electronic structure of a molecule following bond cleavage, while a chemist takes into account the stabilization of the dissociated radicals.
To further corroborate the interpretation of DIM-NN bond energies as unrelaxed homolytic bond dissociation energies (BDEs), we have directly compared DIM-NN bond energies with experiment and DFT. The mean absolute differences of BDEs relative to experimental best estimates from DIM-NN, geometrically unrelaxed DFT, and relaxed DFT are 12.8, 17.2, and 8.3 kcal/mol, respectively (Table S2). The unrelaxed DFT BDEs use the geometry of the molecule, but relax the electronic wavefunction. Please note that by design DIM-NN does not include either electronic or geometric relaxation energy. The fact that DIM-NN outperforms unrelaxed DFT may indicate a cancellation of electronic and geometric relaxation energies.
DIM-NN bond energies are very close to experimental BDEs for simple alkanes which do not undergo significant electronic or geometric relaxation (methyl, cyclopentyl, ethyl, cyclopropyl, t-butyl, etc.), even closer to experiment than DFT. The DIM-NN bond energy is systematically larger than the experimental BDE when the product radical is electronically stabilized (allyl, tolyl, cyclopentadienyl, 1,4-cyclohexadienyl, and bonds to oxygen). The difference in these cases can be interpreted as the electronic relaxation energy of the radical when the difference between relaxed and unrelaxed DFT is small.
To investigate the electronic relaxation effect, we performed embedded DFT calculations of the relative C-H BDEs in the tetrahydrofuran and acetaldehyde-propene examples from Table 1, and estimated the relaxation effect by comparing the difference of the fragment reaction energies frozen at their electronic configuration in the molecule78. In both cases the frozen-embedded calculation predicts the same ordering as DIM-NN and NBO. DIM-NN successfully learns many important classes of chemical heuristics such as bond types (case 1, case 4), geometric stresses (case 7, case 8) and conjugation (case 2). The interpretation consistent with these results is that DIM-NN produces intrinsic bond energies, which can be used to separate the stabilization of products from the intrinsic stabilization of a bond.
We have presented DIM-NN, a method that cheaply and accurately expresses a molecular total energy as a sum over an ensemble of bond energies. The method can be thought of as a quantitative version of the textbook bond energy table all chemists think about and rely upon to understand molecules. Chemists could use DIM-NN to study how small changes in geometry might affect the strength of a bond, and produce quantitative numbers to match their qualitative intuition. The method also produces accurate total energies without any sort of sophisticated non-local interaction between the bonds. This shows that bond networks could contribute to a general neural network model chemistry. This ambitious long-term goal merits future work, for example extending our bond decomposition with additional non-covalent contributions that describe correct long-range forces79. Because of TensorMol’s flexibility, these extensions fit easily within our decomposition scheme. A general model chemistry also requires more diverse sampling of chemical space and a broader swath of the periodic table, and this work is underway in our laboratory. Accurate decomposition of bonds which always occur in pairs, for example the two bonds in a terminal alkyne, will benefit significantly from additional non-equilibrium geometrical data. Interested parties may download our source and train their own generalized and improved DIM-NN models.
Acknowledgement
The authors would like to thank Prof. Xavier Creary for valuable discussions.
Supporting Information
The Supporting Information is available free of charge via the Internet at http://pubs.acs.org/.
References
- 1. Snyder, J. C.; Rupp, M.; Hansen, K.; Blooston, L.; Müller, K.-R.; Burke, K. Orbital-free bond breaking via machine learning. J. Chem. Phys. 2013, 139, 224104.
- 2. Snyder, J. C.; Rupp, M.; Hansen, K.; Müller, K.-R.; Burke, K. Finding density functionals with machine learning. Phys. Rev. Lett. 2012, 108, 253002.
- 3. Yao, K.; Parkhill, J. Kinetic Energy of Hydrocarbons as a Function of Electron Density and Convolutional Neural Networks. J. Chem. Theory Comput. 2016, 12, 1139–1147.
- 4. Behler, J. Neural network potential-energy surfaces in chemistry: a tool for large-scale simulations. Phys. Chem. Chem. Phys. 2011, 13, 17930–17955.
- 5. Handley, C. M.; Popelier, P. L. Potential energy surfaces fitted by artificial neural networks. J. Phys. Chem. A 2010, 114, 3371–3383.
- 6. Schütt, K. T.; Arbabzadah, F.; Chmiela, S.; Müller, K. R.; Tkatchenko, A. Quantum-chemical insights from deep tensor neural networks. Nat. Commun. 2017, 8, 13890.
- 7. Chmiela, S.; Tkatchenko, A.; Sauceda, H. E.; Poltavsky, I.; Schütt, K.; Müller, K.-R. Machine Learning of Accurate Energy-Conserving Molecular Force Fields. arXiv preprint arXiv:1611.04678, 2016.
- 8. Tian, Y.; Yan, X.; Saha, M. L.; Niu, Z.; Stang, P. J. Hierarchical Self-Assembly of Responsive Organoplatinum (II) Metallacycle–TMV Complexes with Turn-On Fluorescence. J. Am. Chem. Soc. 2016, 138, 12033–12036.
