Learning data efficient coarse-grained molecular dynamics from forces and noise
Aleksander E. P. Durumeric, Yaoyi Chen, Frank Noé, Cecilia Clementi

TL;DR
This paper introduces a unified machine learning approach that combines force-based and noise-based training to cut the data needed for coarse-grained molecular dynamics models of biomolecules by a factor of 100.
Contribution
It unifies force-based and noise-based learning methods to enhance data efficiency in training coarse-grained molecular dynamics force-fields.
Findings
Reduced data requirements by a factor of 100
Maintained advantages of force-based parameterization
Demonstrated on proteins Trp-Cage and NTL9
Abstract
Machine-learned coarse-grained (MLCG) molecular dynamics is a promising option for modeling biomolecules. However, MLCG models currently require large amounts of data from reference atomistic molecular dynamics or substantial computation for training. Denoising score matching, the technology behind the widely popular diffusion models, has simultaneously emerged as a machine-learning framework for creating samples from noise. Models in the first category are often trained using atomistic forces, while those in the second category extract the data distribution by reverting noise-based corruption. We unify these approaches to improve the training of MLCG force-fields, reducing data requirements by a factor of 100 while maintaining advantages typical of force-based parameterization. The methods are demonstrated on the proteins Trp-Cage and NTL9 and published as open-source code.
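To make the two training signals named in the abstract concrete, the following is a minimal toy sketch, not the paper's method or code. It assumes a 1D harmonic potential U(x) = k x^2 / 2 at k_B T = 1, where the force equals the score of the Boltzmann distribution, and fits a one-parameter linear model twice: once to reference forces (force matching) and once to a denoising score matching target built from added Gaussian noise. All names (`fit_linear_slope`, `force_matching_slope`, `denoising_score_slope`, `k`, `sigma`) are hypothetical and chosen for illustration.

```python
import numpy as np


def fit_linear_slope(inputs, targets):
    """Least-squares slope c for the one-parameter model targets ~ c * inputs."""
    return float(np.dot(inputs, targets) / np.dot(inputs, inputs))


def force_matching_slope(x, forces):
    """Force matching: regress the model force c * x onto reference forces."""
    return fit_linear_slope(x, forces)


def denoising_score_slope(x, sigma, rng):
    """Denoising score matching: corrupt samples with Gaussian noise and
    regress the model score c * x_noisy onto the target -(x_noisy - x) / sigma**2."""
    eps = rng.standard_normal(x.shape)
    x_noisy = x + sigma * eps
    target = -eps / sigma  # identical to -(x_noisy - x) / sigma**2
    return fit_linear_slope(x_noisy, target)


rng = np.random.default_rng(0)
k = 4.0        # assumed spring constant of the toy potential
sigma = 0.2    # assumed noise level for the corruption step

# Boltzmann samples of U(x) = k x^2 / 2 at k_B T = 1 are Gaussian with variance 1/k.
x = rng.normal(0.0, 1.0 / np.sqrt(k), size=200_000)
forces = -k * x  # exact reference forces for this toy system

fm = force_matching_slope(x, forces)
dsm = denoising_score_slope(x, sigma, rng)

# The noised distribution is Gaussian with variance 1/k + sigma^2, so the
# denoising objective targets the score (force at k_B T = 1) of that smoothed
# distribution rather than of the original one.
expected_dsm = -1.0 / (1.0 / k + sigma**2)

print(f"force matching slope:  {fm:.3f} (exact force slope {-k:.3f})")
print(f"denoising score slope: {dsm:.3f} (smoothed target {expected_dsm:.3f})")
```

In this toy setting the force-matching fit recovers the force of the original potential, while the noise-based fit recovers the force of the noise-smoothed distribution using configurations alone; as sigma shrinks, the two slopes approach each other. How the paper actually combines the two signals is not specified in this summary.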
Taxonomy
Topics: Machine Learning in Materials Science
