Massively Parallel Fitting of Gaussian Approximation Potentials

Sascha Klawohn; James R. Kermode; Albert P. Bartók

arXiv:2207.03803·cond-mat.mtrl-sci·November 14, 2022·Mach. Learn. Sci. Technol.

Open Access

TL;DR

This paper introduces scalable, data-parallel software for fitting Gaussian Approximation Potentials (GAPs). It removes the single-node memory limit on training set size and speeds up fitting on large datasets by distributing the work across multiple nodes with ScaLAPACK, MPI and OpenMP.

Contribution

A new multi-node parallel implementation of GAP fitting that scales to thousands of cores, enabling larger training sets and application to more complex systems.

Findings

01

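Scales to thousands of cores; descriptor evaluation runs in parallel with no communication requirement, and the subsequent linear solve is parallelised with ScaLAPACK.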

02

Lifts the single-node memory limit on training set size.

03

Delivers substantial speedups in model fitting.

Abstract

We present a data-parallel software package for fitting Gaussian Approximation Potentials (GAPs) on multiple nodes using the ScaLAPACK library with MPI and OpenMP. Until now the maximum training set size for GAP models has been limited by the available memory on a single compute node. In our new implementation, descriptor evaluation is carried out in parallel with no communication requirement. The subsequent linear solve required to determine the model coefficients is parallelised with ScaLAPACK. Our approach scales to thousands of cores, lifting the memory limitation and also delivering substantial speedups. This development expands the applicability of the GAP approach to more complex systems as well as opening up opportunities for efficiently embedding GAP model fitting within higher-level workflows such as committee models or hyperparameter optimisation.
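The two stages described in the abstract can be illustrated with a minimal single-process sketch. It is not the package's implementation: the real code distributes the work with MPI and OpenMP and solves with ScaLAPACK, whereas this sketch uses NumPy and an ordinary loop over row blocks. The function names `descriptor_rows` and `fit_in_blocks`, the Gaussian kernel used as a stand-in for GAP descriptors, and the regularisation parameter `reg` are all assumptions made for illustration.

```python
import numpy as np


def descriptor_rows(configs, centers, length_scale=1.0):
    """Toy stand-in for descriptor/kernel evaluation (assumption, not GAP's kernel).

    Returns one row per training configuration and one column per centre.
    Each row depends only on its own configuration, so blocks of rows can be
    computed independently of one another.
    """
    diff = configs[:, None, :] - centers[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale ** 2)


def fit_in_blocks(configs, targets, centers, n_workers=4, reg=1e-8):
    """Assemble the design matrix block by block, then solve for coefficients."""
    # Stage 1: each hypothetical worker owns a contiguous block of rows and
    # evaluates it without needing data from any other worker.
    blocks = np.array_split(np.arange(len(configs)), n_workers)
    local_rows = [descriptor_rows(configs[idx], centers) for idx in blocks]
    local_targets = [targets[idx] for idx in blocks]

    # Stage 2: the linear solve. In a distributed run the blocks would stay on
    # their nodes and a distributed solver would operate on them; here they are
    # simply stacked and handed to NumPy's least-squares routine.
    design = np.vstack(local_rows)
    y = np.concatenate(local_targets)

    # Regularised least squares written as an augmented system.
    n_centers = centers.shape[0]
    design_aug = np.vstack([design, np.sqrt(reg) * np.eye(n_centers)])
    y_aug = np.concatenate([y, np.zeros(n_centers)])
    coeffs, *_ = np.linalg.lstsq(design_aug, y_aug, rcond=None)
    return coeffs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    configs = rng.uniform(-2.0, 2.0, size=(200, 3))   # synthetic training data
    centers = rng.uniform(-2.0, 2.0, size=(8, 3))     # synthetic kernel centres
    targets = np.sin(configs).sum(axis=1)

    coeffs = fit_in_blocks(configs, targets, centers, n_workers=4)
    residual = descriptor_rows(configs, centers) @ coeffs - targets
    print("RMS residual:", np.sqrt(np.mean(residual ** 2)))
```

Because every row of the design matrix is computed from a single configuration, the number of blocks changes only where the work is done, not the fitted coefficients. Only the solve stage needs to see all rows together, which is the part the abstract assigns to ScaLAPACK.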

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification · Scientific Research and Discoveries