Multi Part Deployment of Neural Network
Paritosh Ranjan, Surajit Majumder, Prodip Roy

TL;DR
This paper introduces a distributed architecture for training large neural networks by partitioning them across multiple servers, enabling scalable and cost-effective deployment on cloud infrastructure.
Contribution
The paper proposes a Multi-Part Neural Network Execution Engine and a partitioning strategy for distributing large models across multiple servers.
Findings
Enables training of neural networks with billions of neurons across multiple servers.
Reduces infrastructure costs by avoiding monolithic GPU clusters.
Maintains model consistency during parallel updates.
Abstract
The increasing scale of modern neural networks, exemplified by architectures from IBM (530 billion neurons) and Google (500 billion parameters), presents significant challenges in terms of computational cost and infrastructure requirements. As deep neural networks continue to grow, traditional training paradigms relying on monolithic GPU clusters become increasingly unsustainable. This paper proposes a distributed system architecture that partitions a neural network across multiple servers, each responsible for a subset of neurons. Neurons are classified as local or remote, with inter-server connections managed via a metadata-driven lookup mechanism. A Multi-Part Neural Network Execution Engine facilitates seamless execution and training across distributed partitions by dynamically resolving and invoking remote neurons using stored metadata. All servers share a unified model through a…
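The local/remote neuron lookup described in the abstract can be illustrated with a minimal in-process sketch. This is not the authors' implementation: the names `Neuron`, `Partition`, `metadata`, and `in_process_transport` are hypothetical, the ReLU activation is an assumption, and a plain function call stands in for whatever network transport the real engine would use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Inputs = Dict[str, float]
# Hypothetical transport signature: (owner partition, neuron id, inputs) -> activation
Transport = Callable[[str, str, Inputs], float]


@dataclass
class Neuron:
    neuron_id: str
    weights: Dict[str, float]  # upstream neuron id -> connection weight
    bias: float = 0.0


class Partition:
    """One server's subset of neurons (illustrative structure only)."""

    def __init__(self, name: str, neurons: List[Neuron],
                 metadata: Dict[str, str], transport: Transport) -> None:
        self.name = name
        self.local = {n.neuron_id: n for n in neurons}
        self.metadata = metadata    # neuron id -> name of the owning partition
        self.transport = transport  # stand-in for a remote invocation

    def activate(self, neuron_id: str, inputs: Inputs) -> float:
        # Input-layer values are supplied directly by the caller.
        if neuron_id in inputs:
            return inputs[neuron_id]
        # Local neuron: compute here, resolving upstream neurons recursively.
        if neuron_id in self.local:
            neuron = self.local[neuron_id]
            total = neuron.bias + sum(
                weight * self.activate(source, inputs)
                for source, weight in neuron.weights.items()
            )
            return max(0.0, total)  # ReLU, chosen only for illustration
        # Remote neuron: look up its owner in the metadata and delegate.
        owner = self.metadata[neuron_id]
        return self.transport(owner, neuron_id, inputs)


cluster: Dict[str, Partition] = {}
remote_calls: List[Tuple[str, str]] = []


def in_process_transport(owner: str, neuron_id: str, inputs: Inputs) -> float:
    # Records the call, then invokes the owning partition directly.
    remote_calls.append((owner, neuron_id))
    return cluster[owner].activate(neuron_id, inputs)


metadata = {"h1": "A", "out": "B"}
cluster["A"] = Partition(
    "A", [Neuron("h1", {"x1": 1.0, "x2": 0.5})], metadata, in_process_transport)
cluster["B"] = Partition(
    "B", [Neuron("out", {"h1": 2.0}, bias=1.0)], metadata, in_process_transport)

# Partition B owns "out" but must fetch "h1" from partition A.
result = cluster["B"].activate("out", {"x1": 1.0, "x2": 2.0})
print(result)        # 5.0
print(remote_calls)  # [('A', 'h1')]
```

Here partition B evaluates its own neuron `out` and resolves `h1` through the metadata table, so exactly one cross-partition call is made. A real deployment would replace `in_process_transport` with a network call and would also need a backward pass; neither detail is specified in the text above.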
Taxonomy
Topics: Industrial Vision Systems and Defect Detection · Manufacturing Process and Optimization
