SplitBrain: Hybrid Data and Model Parallel Deep Learning
Farley Lai, Asim Kadav, Erik Kruus

TL;DR
SplitBrain is a high-performance distributed deep learning framework that combines data and model parallelism with layer-specific partitioning and scalable communication, enabling efficient training of large models with reduced memory use.
Contribution
The paper introduces a hybrid parallelism approach that pairs layer-specific partitioning with scalable group communication to improve the training efficiency of large deep learning models.
Findings
Achieves nearly linear speedup in training.
Reduces memory consumption by up to 67%.
Improves training throughput with scalable communication.
Abstract
The recent success of deep learning applications has coincided with the wide availability of powerful computational resources for training sophisticated machine learning models on huge datasets. Nonetheless, training large models such as convolutional neural networks using model parallelism (as opposed to data parallelism) is challenging, because the complex communication between model shards makes it difficult to partition the computation efficiently across multiple machines with an acceptable trade-off. This paper presents SplitBrain, a high-performance distributed deep learning framework supporting hybrid data and model parallelism. Specifically, SplitBrain provides layer-specific partitioning that co-locates compute-intensive convolutional layers while sharding memory-demanding layers. A novel scalable group communication is proposed to further improve the training…
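As a rough illustration of the layer-specific partitioning idea described in the abstract, the following minimal sketch simulates several workers in one process. It is not SplitBrain's implementation. It assumes, purely for illustration, that the memory-demanding layer is a dense (fully connected) layer split by output columns, and that each worker's locally computed features are gathered before the sharded layer runs. The names `matmul`, `shard_columns`, and `hybrid_forward` are hypothetical, and the convolutional stage is represented only by the per-worker feature batches it would produce.

```python
import random


def matmul(x, w):
    """Multiply a (rows x in_dim) matrix by an (in_dim x out_dim) matrix (lists of rows)."""
    cols = list(zip(*w))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]


def shard_columns(w, num_workers):
    """Split the output columns of w into num_workers contiguous shards."""
    out_dim = len(w[0])
    bounds = [out_dim * k // num_workers for k in range(num_workers + 1)]
    return [[row[bounds[k]:bounds[k + 1]] for row in w] for k in range(num_workers)]


def hybrid_forward(batches, w_shards):
    """Simulated forward pass of one sharded dense layer.

    batches[k]  : features worker k computed on its own mini-batch (data-parallel part).
    w_shards[k] : the column shard of the dense weights held by worker k (model-parallel part).
    """
    # Assumed gather step: every worker sees the features from all mini-batches.
    gathered = [row for batch in batches for row in batch]
    # Each worker computes only its own slice of the output columns.
    partial = [matmul(gathered, shard) for shard in w_shards]
    # Concatenating the column slices reproduces the unsharded layer's output.
    return [sum((p[i] for p in partial), []) for i in range(len(gathered))]


# Small demo with integer weights so the comparison below is exact.
random.seed(0)
num_workers, in_dim, out_dim = 4, 6, 8
weights = [[random.randint(-3, 3) for _ in range(out_dim)] for _ in range(in_dim)]
batches = [[[random.randint(-3, 3) for _ in range(in_dim)] for _ in range(2)]
           for _ in range(num_workers)]

shards = shard_columns(weights, num_workers)
sharded_out = hybrid_forward(batches, shards)
reference_out = matmul([row for batch in batches for row in batch], weights)

params_full = in_dim * out_dim
params_per_worker = [len(s) * len(s[0]) for s in shards]
```

In this toy setup each worker stores only its share of the dense weights (`params_per_worker`) while the combined output matches the unsharded layer (`reference_out`). The sizes here are arbitrary and say nothing about the memory savings reported for SplitBrain.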
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Advanced Graph Neural Networks
Methods: Convolution · Softmax · Max Pooling · Dense Connections · Dropout
