Benchmark Assessment for DeepSpeed Optimization Library

Gongbo Liang; Izzat Alsmadi

arXiv:2202.12831 · cs.LG · February 28, 2022 · 1 citation

TL;DR

This paper evaluates the Microsoft DeepSpeed library's impact on training efficiency across various neural network architectures, revealing mixed results with some improvements and some negative effects.

Contribution

It extends prior assessments of DeepSpeed by evaluating its performance on modern neural networks like CNNs and ViT, providing a broader understanding of its effectiveness.

Findings

01. DeepSpeed improves training efficiency for some architectures.

02. In other cases, DeepSpeed has no effect or a negative impact.

03. The evaluation covers modern neural network architectures beyond LeNet.

Abstract

Deep Learning (DL) models are widely used in machine learning because they can handle large datasets while achieving high accuracy. The size of such datasets and the complexity of DL models make training expensive, consuming large amounts of resources and time. Many libraries and applications have recently been introduced to address these complexity and efficiency issues. In this paper, we evaluated one of them, the Microsoft DeepSpeed library, on classification tasks. DeepSpeed public sources reported classification performance metrics on the LeNet architecture. We extended this by evaluating the library on several modern neural network architectures, including convolutional neural networks (CNNs) and Vision Transformer (ViT). Results indicated that while DeepSpeed can make improvements in some of those cases, it has no or…
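
The kind of comparison the abstract describes (the same training job run with and without an optimization library) can be sketched as a small timing harness. This is a minimal illustration, not the authors' benchmark code: `time_epochs`, `compare`, and the two sleep-based stand-in workloads are hypothetical names introduced here, and no DeepSpeed call is made. In a real run, each callable would wrap one epoch of training under the corresponding setup.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_epochs(train_one_epoch: Callable[[], None], epochs: int = 3) -> List[float]:
    """Return the wall-clock seconds taken by each of `epochs` calls."""
    durations = []
    for _ in range(epochs):
        start = time.perf_counter()
        train_one_epoch()
        durations.append(time.perf_counter() - start)
    return durations


def compare(baseline: Callable[[], None],
            candidate: Callable[[], None],
            epochs: int = 3) -> Dict[str, float]:
    """Compare mean epoch time of two setups.

    speedup > 1 means the candidate is faster than the baseline;
    speedup < 1 means the candidate slowed training down.
    """
    base = statistics.mean(time_epochs(baseline, epochs))
    cand = statistics.mean(time_epochs(candidate, epochs))
    return {"baseline_s": base, "candidate_s": cand, "speedup": base / cand}


# Hypothetical stand-ins for one training epoch (assumption: real code
# would train a CNN or ViT here, with and without the library enabled).
def baseline_epoch() -> None:
    time.sleep(0.04)


def candidate_epoch() -> None:
    time.sleep(0.01)


if __name__ == "__main__":
    print(compare(baseline_epoch, candidate_epoch))
```

Reporting a ratio rather than raw seconds makes "no effect" (speedup near 1) and "negative effect" (speedup below 1) easy to tell apart across architectures of very different sizes.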

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification

Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Residual Connection · Layer Normalization · Label Smoothing · Dropout · Dense Connections