Does AI for science need another ImageNet Or totally different benchmarks? A case study of machine learning force fields

Yatao Li; Wanling Gao; Lei Wang; Lixin Sun; Zun Wang; Jianfeng Zhan

arXiv:2308.05999 · cs.LG · August 14, 2023

Open Access

TL;DR

This paper argues that AI for science requires specialized benchmarks that reflect real-world scientific challenges, demonstrated through a case study on machine learning force fields for molecular dynamics simulations.

Contribution

The paper introduces a benchmarking approach tailored to AI4S that evaluates models on sample efficiency, time sensitivity, and generalization, so that results are more relevant to scientific applications.

Findings

01. Traditional benchmarks are inadequate for AI4S due to out-of-distribution challenges.

02. Proposed metrics better assess models' real-world scientific performance.

03. The benchmark suite enhances evaluation of ML models in scientific contexts.

Abstract

AI for science (AI4S) is an emerging research field that aims to enhance the accuracy and speed of scientific computing tasks using machine learning methods. Traditional AI benchmarking methods struggle to adapt to the unique challenges posed by AI4S because they assume data in training, testing, and future real-world queries are independent and identically distributed, while AI4S workloads anticipate out-of-distribution problem instances. This paper investigates the need for a novel approach to effectively benchmark AI for science, using the machine learning force field (MLFF) as a case study. MLFF is a method to accelerate molecular dynamics (MD) simulation with low computational cost and high accuracy. We identify various missed opportunities in scientifically meaningful benchmarking and propose solutions to evaluate MLFF models, specifically in the aspects of sample efficiency, time…

Taxonomy

Topics: Machine Learning in Materials Science · Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings