Scalable Training of Trustworthy and Energy-Efficient Predictive Graph Foundation Models for Atomistic Materials Modeling: A Case Study with HydraGNN
Massimiliano Lupo Pasini, Jong Youl Choi, Kshitij Mehta, Pei Zhang, David Rogers, Jonghyun Bae, Khaled Z. Ibrahim, Ashwin M. Aji, Karl W. Schulz, Jorda Polo, Prasanna Balaprakash

TL;DR
This paper introduces HydraGNN, a scalable and energy-efficient graph neural network framework for atomistic materials modeling that trains on tens of thousands of GPUs over large, diverse datasets.
Contribution
The paper presents HydraGNN, a multi-headed graph neural network (GNN) architecture that scales the training of large graph foundation models (GFMs) to hundreds of millions of graphs on supercomputers for atomistic materials modeling.
Findings
Scales training to over 154 million graphs using 16,000 GPUs.
Achieves near-linear strong scaling on US-DOE supercomputers.
Demonstrates effective multi-task learning for atomistic property prediction.
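Strong scaling, as claimed in the findings above, keeps the total problem size fixed while the GPU count grows. The sketch below computes the standard parallel efficiency relative to a baseline run. The function name and the timings are hypothetical and are not taken from the paper; they only illustrate what "near-linear" means numerically.

```python
def strong_scaling_efficiency(base_gpus, base_time, gpus, time):
    """Parallel efficiency relative to a baseline run of the same fixed-size problem.

    1.0 means perfectly linear strong scaling; values below 1.0 mean the
    measured speedup falls short of the increase in GPU count.
    """
    speedup = base_time / time          # measured speedup over the baseline
    ideal_speedup = gpus / base_gpus    # speedup expected under linear scaling
    return speedup / ideal_speedup


# Hypothetical timings (seconds per epoch), NOT measurements from the paper.
runs = [(128, 1000.0), (256, 520.0), (512, 275.0)]
base_gpus, base_time = runs[0]
for gpus, seconds in runs:
    eff = strong_scaling_efficiency(base_gpus, base_time, gpus, seconds)
    print(f"{gpus:5d} GPUs: efficiency {eff:.2f}")
```

An efficiency that stays close to 1.0 as the GPU count increases is what is usually meant by near-linear strong scaling.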
Abstract
We present our work on developing and training scalable, trustworthy, and energy-efficient predictive graph foundation models (GFMs) using HydraGNN, a multi-headed graph convolutional neural network architecture. HydraGNN expands the boundaries of graph neural network (GNN) computations in both training scale and data diversity. It abstracts over message passing algorithms, allowing both reproduction of and comparison across algorithmic innovations that define nearest-neighbor convolution in GNNs. This work discusses a series of optimizations that have allowed GFM training to scale to tens of thousands of GPUs on datasets consisting of hundreds of millions of graphs. Our GFMs use multi-task learning (MTL) to simultaneously learn graph-level and node-level properties of atomistic structures, such as energy and atomic forces. Using over 154 million atomistic structures for…
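The multi-task setup described in the abstract combines a graph-level target (energy, one scalar per structure) with a node-level target (forces, one 3-vector per atom). A minimal sketch of such a combined loss follows. The mean-squared-error terms, the weights `w_energy` and `w_force`, and all function names are assumptions made for illustration; this excerpt does not state the loss HydraGNN actually uses.

```python
def mse(pred, target):
    """Mean squared error over two equal-length flat sequences."""
    assert len(pred) == len(target) and len(pred) > 0
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multitask_loss(energy_pred, energy_true, forces_pred, forces_true,
                   w_energy=1.0, w_force=1.0):
    """Weighted sum of a graph-level and a node-level loss (illustrative only).

    energy_*: one scalar for the whole structure (graph-level head).
    forces_*: one [fx, fy, fz] list per atom (node-level head).
    """
    energy_loss = mse([energy_pred], [energy_true])
    # Flatten per-atom force vectors so every component is weighted equally.
    flat_pred = [c for atom in forces_pred for c in atom]
    flat_true = [c for atom in forces_true for c in atom]
    force_loss = mse(flat_pred, flat_true)
    return w_energy * energy_loss + w_force * force_loss


# Toy single-atom structure with made-up numbers.
loss = multitask_loss(1.0, 0.0, [[0.0, 0.0, 0.0]], [[0.0, 0.0, 3.0]])
print(loss)
```

In a multi-headed model, each term would be computed from the output of its own head on top of shared message-passing layers, so both tasks update the shared layers.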
Taxonomy
Topics: Machine Learning in Materials Science · Fuel Cells and Related Materials
Methods: Hyper-parameter optimization · Early Stopping · Graph Neural Network · Convolution
