Comparative Study of Deep Learning Software Frameworks
Soheil Bahrampour, Naveen Ramakrishnan, Lukas Schott, Mohak Shah

TL;DR
This paper compares five deep learning frameworks on extensibility, hardware utilization, and speed, evaluating several network architectures on a single machine in both multi-threaded CPU and GPU (Nvidia Titan X) settings.
Contribution
It offers a comprehensive empirical comparison of Caffe, Neon, TensorFlow, Theano, and Torch, highlighting their strengths and weaknesses in different deep learning tasks.
Findings
Torch is best for CPU-based deep architectures.
On GPU, Torch performs best for large convolutional and fully connected networks, while Theano performs best for LSTM networks.
TensorFlow is highly flexible, but at the time of the study its performance was not competitive with the other frameworks.
Abstract
Deep learning methods have resulted in significant performance improvements in several application domains and as such several software frameworks have been developed to facilitate their implementation. This paper presents a comparative study of five deep learning frameworks, namely Caffe, Neon, TensorFlow, Theano, and Torch, on three aspects: extensibility, hardware utilization, and speed. The study is performed on several types of deep learning architectures and we evaluate the performance of the above frameworks when employed on a single machine for both (multi-threaded) CPU and GPU (Nvidia Titan X) settings. The speed performance metrics used here include the gradient computation time, which is important during the training phase of deep networks, and the forward time, which is important from the deployment perspective of trained networks. For convolutional networks, we also report…
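The two speed metrics named in the abstract, forward time and gradient computation time, can be illustrated with a small timing harness. The following is only a minimal sketch in pure Python, not the authors' benchmark code: the helper names (`time_ms`, `forward`, `gradient`), the single tanh layer, the layer sizes, and the warm-up and repeat counts are all assumptions chosen for illustration, and the gradient timing here is assumed to include the forward pass it depends on.

```python
import math
import random
import time
from statistics import mean


def time_ms(fn, warmup=2, repeats=10):
    """Return the mean wall-clock time of fn() in milliseconds (hypothetical helper)."""
    for _ in range(warmup):
        fn()  # discard warm-up runs so one-off start-up costs are not measured
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


def forward(weights, x):
    """One fully connected tanh layer: y_j = tanh(sum_i w[j][i] * x[i])."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def gradient(weights, x, target):
    """Gradient of 0.5 * ||y - target||^2 with respect to the weights."""
    y = forward(weights, x)  # assumption: the forward pass is part of the timed work
    # dL/dw[j][i] = (y_j - t_j) * (1 - y_j^2) * x_i
    deltas = [(yj - tj) * (1.0 - yj * yj) for yj, tj in zip(y, target)]
    return [[d * xi for xi in x] for d in deltas]


# Toy problem sizes, chosen arbitrarily for illustration.
random.seed(0)
n_in, n_out = 64, 32
weights = [[random.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
x = [random.uniform(-1.0, 1.0) for _ in range(n_in)]
target = [0.0] * n_out

forward_ms = time_ms(lambda: forward(weights, x))
gradient_ms = time_ms(lambda: gradient(weights, x, target))
print(f"forward time:  {forward_ms:.3f} ms")
print(f"gradient time: {gradient_ms:.3f} ms")
```

Forward time matters once a trained network is deployed, while gradient computation time dominates during training, which is why the two are reported separately. Benchmarking a real framework on GPU would additionally require synchronizing the device before reading the clock, which this CPU-only sketch does not need.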
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Sigmoid Activation · Tanh Activation · Long Short-Term Memory
