Efficient Data Selection Methods for the Development of Machine Learned Potentials
Jan Finkbeiner, Samuel Tovey, Christian Holm

TL;DR
This paper investigates data selection techniques for developing inter-atomic potentials in molecular dynamics, highlighting the effectiveness of atomic-level information methods and analyzing their impact on simulation physicality.
Contribution
It introduces and compares data selection methods incorporating atomic forces or energies for efficient potential development in MD simulations.
Findings
Selection methods that use atomic-level information, such as forces or atomic energies, sample configuration space most efficiently.
Global selection methods can lead to non-physical simulations.
Atomic force-based sampling improves potential accuracy.
Abstract
We present an investigation into data selection methods for the efficient sampling of configuration space as applied to the development of inter-atomic potentials for scale bridging in molecular dynamics (MD) simulations. This investigation suggests that the most efficient sampling techniques are those that incorporate information on an atomic level such as forces or atomic energies. Finally, we generate an inter-atomic potential for a sodium chloride system using each data selection technique and find that the global selection methods result in non-physical simulations.
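The contrast between global and atomic-level selection can be illustrated with a minimal sketch. It is not the paper's algorithm: the greedy farthest-point sampling, the sorted force-magnitude descriptor, and the names `farthest_point_selection`, `global_feature`, and `atomic_feature` are all assumptions chosen for illustration, and the data are random placeholders rather than MD output.

```python
import math
import random


def farthest_point_selection(features, n_select, start=0):
    """Greedily pick configurations that are far apart in feature space.

    Illustrative only: farthest-point sampling is an assumed selection rule,
    not necessarily one of the methods compared in the paper.
    """
    selected = [start]
    # Distance from each configuration to its nearest selected configuration.
    min_dist = [math.dist(f, features[start]) for f in features]
    while len(selected) < min(n_select, len(features)):
        # Next pick: the configuration farthest from everything chosen so far.
        nxt = max(range(len(features)), key=min_dist.__getitem__)
        selected.append(nxt)
        min_dist = [
            min(d, math.dist(f, features[nxt]))
            for d, f in zip(min_dist, features)
        ]
    return selected


def global_feature(energy):
    """Global descriptor (assumption): the total energy of a configuration."""
    return [energy]


def atomic_feature(forces):
    """Atomic-level descriptor (assumption): sorted per-atom force magnitudes.

    Sorting makes the descriptor independent of atom ordering.
    """
    return sorted(math.hypot(fx, fy, fz) for fx, fy, fz in forces)


# Placeholder data: 50 configurations of 4 atoms each (random, not real MD).
rng = random.Random(0)
all_forces = [
    [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4)]
    for _ in range(50)
]
all_energies = [rng.gauss(0, 1) for _ in range(50)]

idx_global = farthest_point_selection(
    [global_feature(e) for e in all_energies], 5
)
idx_atomic = farthest_point_selection(
    [atomic_feature(f) for f in all_forces], 5
)
```

Both calls use the same selection rule, so any difference between `idx_global` and `idx_atomic` comes from the descriptor alone. The global descriptor reduces each configuration to a single number, whereas the atomic one keeps one value per atom.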
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Protein Structure and Dynamics
