Nutmeg and SPICE: Models and Data for Biomolecular Machine Learning
Peter Eastman, Benjamin P. Pritchard, John D. Chodera, Thomas E. Markland

TL;DR
This paper introduces the Nutmeg machine learning potentials trained on the expanded SPICE dataset, demonstrating accurate energy predictions and stable molecular dynamics for charged and large molecules, advancing biomolecular modeling.
Contribution
It presents a new version of the SPICE dataset with extensive chemical space sampling and a novel Nutmeg model architecture that improves performance on charged molecules.
Findings
Nutmeg models accurately reproduce energy differences between conformations.
Models generate stable molecular dynamics trajectories.
Models are computationally efficient for routine simulations.
Abstract
We describe version 2 of the SPICE dataset, a collection of quantum chemistry calculations for training machine learning potentials. It expands on the original dataset by adding much more sampling of chemical space and more data on non-covalent interactions. We train a set of potential energy functions called Nutmeg on it. They are based on the TensorNet architecture. They use a novel mechanism to improve performance on charged and polar molecules, injecting precomputed partial charges into the model to provide a reference for the large scale charge distribution. Evaluation of the new models shows they do an excellent job of reproducing energy differences between conformations, even on highly charged molecules or ones that are significantly larger than the molecules in the training set. They also produce stable molecular dynamics trajectories, and are fast enough to be useful for…
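The abstract's claim about "reproducing energy differences between conformations" can be illustrated with a small sketch. It assumes one plausible way to score relative conformer energies: shift the reference and predicted energies of a single molecule so each set has zero mean, then take the mean absolute error. The abstract does not state the paper's exact metric, and the function names and numbers below are hypothetical.

```python
from statistics import fmean


def relative_energies(energies):
    """Shift a molecule's conformer energies so that their mean is zero."""
    mean = fmean(energies)
    return [e - mean for e in energies]


def relative_energy_mae(reference, predicted):
    """Mean absolute error between mean-centered conformer energies.

    Centering removes any constant offset between the two energy scales,
    so only differences between conformations contribute to the error.
    This is an illustrative metric, not necessarily the one used in the paper.
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("need two non-empty energy lists of equal length")
    ref = relative_energies(reference)
    pred = relative_energies(predicted)
    return fmean([abs(r - p) for r, p in zip(ref, pred)])


# Hypothetical energies for three conformers of one molecule (arbitrary units).
reference = [0.0, 1.0, 3.0]
predicted = [10.0, 11.5, 12.5]
mae = relative_energy_mae(reference, predicted)
```

Because both sets are centered first, a model whose energies differ from the reference only by a constant shift scores zero error under this metric.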
Taxonomy
Topics: Bioinformatics and Genomic Networks
Methods: Sparse Evolutionary Training
