Large Scale Artificial Neural Network Training Using Multi-GPUs
Linnan Wang, Wei Wu, Jianxiong Xiao, Yang Yi

TL;DR
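This paper presents a method to accelerate large-scale neural network training by reducing the forward and backward passes to matrix multiplication and running that multiplication out-of-core across multiple GPUs. The matrix multiplication algorithm achieves linear speedup on multiple inhomogeneous GPUs.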
Contribution
It introduces an out-of-core multi-GPU matrix multiplication algorithm integrated with neural network training, enabling efficient large-scale ANN training.
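To make the idea of out-of-core, multi-device matrix multiplication concrete, here is a minimal sketch in plain Python. It is not the paper's algorithm: the tile size, the round-robin assignment of output tiles, and the names `matmul_block` and `tiled_matmul` are all assumptions chosen for illustration, and the "devices" are only simulated by a counter. The point it shows is that the product can be assembled from small blocks, so no device ever needs the full matrices in memory at once.

```python
from collections import deque


def matmul_block(a, b):
    """Dense product of two small blocks stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def tiled_matmul(a, b, tile=2, num_devices=2):
    """Compute a @ b tile by tile (hypothetical helper, not from the paper).

    Each output tile is one task. Tasks are handed to simulated devices in
    round-robin order; a real system would run them on GPUs and overlap
    host-to-device transfers with computation.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    tasks = deque((i, j) for i in range(0, n, tile) for j in range(0, m, tile))
    tiles_done = [0] * num_devices
    device = 0
    while tasks:
        i, j = tasks.popleft()
        # Stream over the shared dimension so only tile-sized blocks are "in core".
        for p in range(0, k, tile):
            a_blk = [row[p:p + tile] for row in a[i:i + tile]]
            b_blk = [row[j:j + tile] for row in b[p:p + tile]]
            part = matmul_block(a_blk, b_blk)
            for r, part_row in enumerate(part):
                for s, value in enumerate(part_row):
                    c[i + r][j + s] += value
        tiles_done[device] += 1
        device = (device + 1) % num_devices
    return c, tiles_done


a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = [[1, 0], [0, 1], [2, 2]]
c, tiles_done = tiled_matmul(a, b, tile=2, num_devices=2)
```

Round-robin assignment is the simplest possible choice. With GPUs of unequal speed, a shared queue from which each device pulls its next tile when it becomes idle would balance the load better, since faster devices would naturally take more tiles.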
Findings
Matrix multiplication algorithm achieves linear speedup on multiple inhomogeneous GPUs
Effective out-of-core matrix multiplication algorithm
Improves training efficiency for large neural networks
Abstract
This paper describes a method for accelerating large-scale Artificial Neural Network (ANN) training using multi-GPUs by reducing the forward and backward passes to matrix multiplication. We propose an out-of-core multi-GPU matrix multiplication algorithm and integrate it with ANN training. The experiments demonstrate that our matrix multiplication algorithm achieves linear speedup on multiple inhomogeneous GPUs. The full paper of this project can be found at [1].
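As an illustration of what "reducing the forward and backward passes to matrix multiplication" can mean, the sketch below treats a single fully connected layer with no bias and no activation. This is an assumed, simplified setting, not the paper's formulation, and the names `linear_forward` and `linear_backward` are hypothetical. With a batch of inputs X and weights W, the forward pass is Y = XW, and the backward pass needs only two further products: the weight gradient is X^T G and the input gradient is G W^T, where G is the gradient with respect to Y.

```python
def matmul(a, b):
    """Dense matrix product for matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def linear_forward(x, w):
    """Forward pass of a bias-free linear layer: Y = X W."""
    return matmul(x, w)


def linear_backward(x, w, grad_y):
    """Backward pass expressed as two matrix products."""
    grad_w = matmul(transpose(x), grad_y)   # dL/dW = X^T G
    grad_x = matmul(grad_y, transpose(w))   # dL/dX = G W^T
    return grad_x, grad_w


x = [[1, 2], [3, 4]]            # batch of 2 samples, 2 input features
w = [[1, 0, 2], [0, 1, 3]]      # 2 inputs -> 3 outputs
y = linear_forward(x, w)
grad_y = [[1, 1, 1], [1, 1, 1]]  # placeholder upstream gradient
grad_x, grad_w = linear_backward(x, w, grad_y)
```

Because every step is a matrix product, any faster `matmul` (for example a tiled multi-GPU one) could be substituted without changing the training logic.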
Taxonomy
Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques · Brain Tumor Detection and Classification
