Addressing the Memory Bottleneck in AI Model Training
David Ojika, Bhavesh Patel, G. Anthony Reina, Trent Boyer, Chad Martin, Prashant Shah

TL;DR
This paper demonstrates training large, memory-intensive AI models, including a deep neural network with a memory footprint of about 1 TB, on a single server using Intel-optimized TensorFlow and high-memory x86 hardware.
Contribution
It presents what the authors believe to be the first training of a deep neural network with a large memory footprint (~1 TB) on a single-node server, using Intel-optimized TensorFlow on a high-memory x86 system.
Findings
Successful training of a ~1 TB neural network on a single server
Use of Intel-optimized TensorFlow enables large model training
Hardware configuration effectively addresses memory bottleneck
Abstract
Using medical imaging as a case study, we demonstrate how Intel-optimized TensorFlow on an x86-based server equipped with 2nd Generation Intel Xeon Scalable Processors with large system memory allows for the training of memory-intensive AI/deep-learning models in a scale-up server configuration. We believe our work represents the first training of a deep neural network with a large memory footprint (~1 TB) on a single-node server. We recommend this configuration to scientists and researchers who wish to develop large, state-of-the-art AI models but are currently limited by memory.
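To make the memory bottleneck concrete, the following back-of-envelope sketch estimates how the activations a network must keep for backpropagation grow with the size of a 3D input volume. Nothing here comes from the paper: the function names, the four-level encoder layout, the channel widths, and the float32 assumption are all hypothetical, chosen only to illustrate the scaling.

```python
# Rough estimate of stored activation memory for a 3D convolutional encoder.
# All layer sizes below are illustrative assumptions, not the paper's model.

GIB = 1024 ** 3


def activation_bytes(batch, depth, height, width, channels, bytes_per_value=4):
    """Bytes for one activation tensor of shape (batch, D, H, W, C); float32 by default."""
    return batch * depth * height * width * channels * bytes_per_value


def estimate_footprint(batch, side, channel_widths, bytes_per_value=4):
    """Sum one stored activation tensor per encoder level.

    Assumes a cubic input of edge length `side` that is halved at each level
    (as with stride-2 pooling), with `channel_widths[i]` channels at level i.
    """
    total = 0
    for level, channels in enumerate(channel_widths):
        s = side // (2 ** level)
        total += activation_bytes(batch, s, s, s, channels, bytes_per_value)
    return total


if __name__ == "__main__":
    widths = [32, 64, 128, 256]  # hypothetical channel counts per level
    for side in (64, 128, 256):
        est = estimate_footprint(batch=1, side=side, channel_widths=widths)
        print(f"{side}^3 input: about {est / GIB:.3f} GiB for one tensor per level")
```

Under these assumptions, doubling the edge length of the input volume multiplies the estimate by eight. A real network stores many tensors per level, plus weights, gradients, and optimizer state, so actual footprints are considerably larger than this lower bound.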
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Radiomics and Machine Learning in Medical Imaging
