Distributed Machine Learning for Computational Engineering using MPI
Kailai Xu, Weiqiang Zhu, Eric Darve

TL;DR
This paper introduces a parallel computing framework combining neural network training with PDE solvers, enabling efficient large-scale simulations by parallelizing both components and separating data communication from computation.
Contribution
The framework parallelizes both the numerical PDE solvers and the neural networks, in forward and adjoint computations, and treats data communication as a node in the computational graph.
Findings
Achieved substantial acceleration in training neural networks coupled with PDEs by using parallel PDE solvers.
Demonstrated effectiveness on various large-scale problems.
Separated data communication from computation for better modularity.
Abstract
We propose a framework for training neural networks that are coupled with partial differential equations (PDEs) in a parallel computing environment. Unlike most distributed computing frameworks for deep neural networks, our focus is to parallelize both numerical solvers and deep neural networks in forward and adjoint computations. Our parallel computing model views data communication as a node in the computational graph for numerical simulations. The advantage of our model is that data communication and computing are cleanly separated and thus provide better flexibility, modularity, and testability. We demonstrate using various large-scale problems that we can achieve substantial acceleration by using parallel solvers for PDEs in training deep neural networks that are coupled with PDEs.
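The idea of treating data communication as a node in the computational graph can be illustrated with a small sketch. The Python code below is not the authors' implementation: it simulates MPI ranks as a list of NumPy arrays in a single process, and the names `halo_exchange`, `halo_exchange_adjoint`, and `local_laplacian` are hypothetical. It assumes a 1D grid with zero boundary values. The communication node has a forward rule (fill ghost cells from neighbours) and an adjoint rule (send ghost-cell gradients back to the owning rank), while the numerical kernel stays purely local, so the two can be tested separately.

```python
import numpy as np

def halo_exchange(chunks):
    """Forward communication node (simulated): pad each rank's chunk with one
    ghost cell per side, filled from the neighbouring rank, or with zero at
    the physical boundary."""
    n_ranks = len(chunks)
    padded = []
    for r, u in enumerate(chunks):
        left = chunks[r - 1][-1] if r > 0 else 0.0
        right = chunks[r + 1][0] if r < n_ranks - 1 else 0.0
        padded.append(np.concatenate(([left], u, [right])))
    return padded

def halo_exchange_adjoint(grads):
    """Adjoint communication node: gradients with respect to ghost cells are
    sent back and accumulated into the cell of the rank that owns the value."""
    n_ranks = len(grads)
    out = [g[1:-1].copy() for g in grads]
    for r, g in enumerate(grads):
        if r > 0:
            out[r - 1][-1] += g[0]   # left ghost came from rank r-1's last cell
        if r < n_ranks - 1:
            out[r + 1][0] += g[-1]   # right ghost came from rank r+1's first cell
    return out

def local_laplacian(padded):
    """Purely local computation node: second difference on a padded chunk."""
    return padded[:-2] - 2.0 * padded[1:-1] + padded[2:]

# Compare the "distributed" result with a serial reference.
rng = np.random.default_rng(0)
u = rng.standard_normal(12)
chunks = np.split(u, 3)                      # three simulated ranks
padded = halo_exchange(chunks)
dist = np.concatenate([local_laplacian(p) for p in padded])
serial = local_laplacian(np.concatenate(([0.0], u, [0.0])))

# Dot-product test: <H u, v> should equal <u, H^T v> for the exchange H.
v = [rng.standard_normal(p.shape) for p in padded]
lhs = sum(float(np.dot(p, w)) for p, w in zip(padded, v))
back = halo_exchange_adjoint(v)
rhs = sum(float(np.dot(c, b)) for c, b in zip(chunks, back))
```

In a real MPI setting, the two exchange functions would be replaced by neighbour sends and receives, while `local_laplacian` would remain unchanged. That is the kind of separation between communication and computation the abstract describes.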
Taxonomy
Topics: Reservoir Engineering and Simulation Methods · Model Reduction and Neural Networks · Neural Networks and Applications
