Scalable training of graph convolutional neural networks for fast and accurate predictions of HOMO-LUMO gap in molecules
Jong Youl Choi, Pei Zhang, Kshitij Mehta, Andrew Blanchard, Massimiliano Lupo Pasini

TL;DR
This paper presents a scalable approach to training graph convolutional neural networks on high-performance computing systems, enabling fast and accurate prediction of the HOMO-LUMO gap in molecules using large datasets and distributed training techniques.
Contribution
The work uses HydraGNN, the authors' in-house library for large-scale GCNN training, together with ADIOS on HPC systems, reducing data loading time by up to 4.2 times and scaling linearly up to 1,024 GPUs.
Findings
Data loading time reduced by up to 4.2 times
Achieved linear scaling up to 1,024 GPUs
Demonstrated effective prediction of HOMO-LUMO gaps in large molecular datasets
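To make the "linear scaling" finding concrete, here is a minimal sketch of how speedup and parallel efficiency are commonly computed from per-epoch timings. The function name `scaling_report` and the timing values are hypothetical and chosen for illustration only; they are not measurements from the paper.

```python
def scaling_report(timings):
    """Return {gpu_count: (speedup, efficiency)} relative to the smallest run.

    timings maps GPU count -> seconds per epoch (hypothetical values here).
    """
    base_gpus = min(timings)
    base_time = timings[base_gpus]
    report = {}
    for gpus in sorted(timings):
        speedup = base_time / timings[gpus]   # measured speedup vs. the baseline run
        ideal = gpus / base_gpus              # speedup under perfectly linear scaling
        report[gpus] = (speedup, speedup / ideal)
    return report


# Hypothetical timings for illustration only; not results from the paper.
example = {128: 80.0, 256: 40.0, 512: 20.0, 1024: 10.0}

for gpus, (speedup, efficiency) in scaling_report(example).items():
    print(f"{gpus:5d} GPUs  speedup {speedup:5.2f}x  efficiency {efficiency:.2f}")
```

An efficiency close to 1.0 at every GPU count is what "linear scaling" means in this sense: doubling the GPUs roughly halves the time per epoch.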
Abstract
Graph Convolutional Neural Network (GCNN) is a popular class of deep learning (DL) models in material science to predict material properties from the graph representation of molecular structures. Training an accurate and comprehensive GCNN surrogate for molecular design requires large-scale graph datasets and is usually a time-consuming process. Recent advances in GPUs and distributed computing open a path to reduce the computational cost for GCNN training effectively. However, efficient utilization of high performance computing (HPC) resources for training requires simultaneously optimizing large-scale data management and scalable stochastic batched optimization techniques. In this work, we focus on building GCNN models on HPC systems to predict material properties of millions of molecules. We use HydraGNN, our in-house library for large-scale GCNN training, leveraging distributed data…
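As a rough illustration of what predicting a property "from the graph representation of molecular structures" involves, the sketch below runs one mean-aggregation graph convolution step over a tiny molecular graph and pools the result into a single graph-level value. It is a simplified, generic example and not HydraGNN's API or architecture: the function names, the scalar atom features, and the fixed weights are all assumptions, and the output is not a physical HOMO-LUMO gap.

```python
def graph_conv(features, edges, w_self, w_neigh):
    """One message-passing step with scalar node features.

    Each node combines its own feature with the mean of its neighbours'
    features, followed by a ReLU. Weights are fixed scalars (untrained).
    """
    neighbours = {i: [] for i in range(len(features))}
    for a, b in edges:                      # undirected bonds
        neighbours[a].append(b)
        neighbours[b].append(a)
    updated = []
    for i, h in enumerate(features):
        nbrs = neighbours[i]
        mean_nbr = sum(features[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        updated.append(max(0.0, w_self * h + w_neigh * mean_nbr))
    return updated


def readout(features):
    """Mean pooling: collapse node features into one graph-level value."""
    return sum(features) / len(features)


# Water as a graph: node features are atomic numbers (O, H, H); edges are bonds.
atoms = [8.0, 1.0, 1.0]
bonds = [(0, 1), (0, 2)]

hidden = graph_conv(atoms, bonds, w_self=0.5, w_neigh=0.5)
print(hidden, readout(hidden))
```

In a trained GCNN the weights are learned matrices, several such layers are stacked, and the pooled value is regressed against reference data; the structure of the computation is the same.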
Taxonomy
Topics: Machine Learning in Materials Science · Advanced Graph Neural Networks · Computational Drug Discovery Methods
Methods: Lib
