Forces are not Enough: Benchmark and Critical Evaluation for Machine Learning Force Fields with Molecular Simulations
Xiang Fu, Zhenghao Wu, Wujie Wang, Tian Xie, Sinan Keten, Rafael Gomez-Bombarelli, Tommi Jaakkola

TL;DR
This paper introduces a new benchmark suite for evaluating machine learning force fields in molecular dynamics, emphasizing realistic simulation metrics over traditional force prediction errors, and highlights stability as a key area for improvement.
Contribution
The paper presents a comprehensive benchmark suite with evaluation metrics aligned to scientific objectives, and critically assesses the performance of state-of-the-art ML force fields in realistic MD simulations.
Findings
Force accuracy does not correlate well with simulation quality.
Many SOTA ML force fields struggle with stability during simulations.
The benchmark suite and metrics facilitate targeted improvements.
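To make the first two findings concrete, here is a minimal sketch of the two kinds of measurement being contrasted: a per-frame force error and a trajectory-level stability check. It is an illustration, not the benchmark's own code. The function names, the bond-length criterion, and the 0.5 tolerance are assumptions chosen for this example; this summary does not state the paper's exact metric definitions.

```python
import numpy as np


def force_mae(pred_forces, ref_forces):
    """Mean absolute error over all atoms and Cartesian components."""
    diff = np.asarray(pred_forces, dtype=float) - np.asarray(ref_forces, dtype=float)
    return float(np.mean(np.abs(diff)))


def first_unstable_frame(positions, bonds, ref_lengths, tol=0.5):
    """Return the index of the first frame in which any bond length deviates
    from its reference by more than `tol` (hypothetical criterion).
    Returns the number of frames if the trajectory never crosses the tolerance.

    positions:   (T, N, 3) array of atomic coordinates per frame
    bonds:       (B, 2) integer array of bonded atom index pairs
    ref_lengths: (B,) reference bond lengths, same units as positions
    """
    positions = np.asarray(positions, dtype=float)
    bonds = np.asarray(bonds)
    i, j = bonds[:, 0], bonds[:, 1]
    # Bond lengths for every frame, shape (T, B).
    lengths = np.linalg.norm(positions[:, i] - positions[:, j], axis=-1)
    # A frame counts as unstable if any bond exceeds the tolerance.
    broken = np.any(np.abs(lengths - np.asarray(ref_lengths, dtype=float)) > tol, axis=1)
    return int(np.argmax(broken)) if broken.any() else positions.shape[0]
```

A model can score a low `force_mae` on held-out frames and still produce a trajectory whose `first_unstable_frame` is small, because the force error is averaged over independent snapshots while the stability check depends on errors accumulating over the course of a simulation.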
Abstract
Molecular dynamics (MD) simulation techniques are widely used for various natural science applications. Increasingly, machine learning (ML) force field (FF) models begin to replace ab-initio simulations by predicting forces directly from atomic structures. Despite significant progress in this area, such techniques are primarily benchmarked by their force/energy prediction errors, even though the practical use case would be to produce realistic MD trajectories. We aim to fill this gap by introducing a novel benchmark suite for learned MD simulation. We curate representative MD systems, including water, organic molecules, a peptide, and materials, and design evaluation metrics corresponding to the scientific objectives of respective systems. We benchmark a collection of state-of-the-art (SOTA) ML FF models and illustrate, in particular, how the commonly benchmarked force accuracy is not…
Taxonomy
Topics: Machine Learning in Materials Science · Fuel Cells and Related Materials · Protein Structure and Dynamics
