Leveraging the HW/SW Optimizations and Ecosystems that Drive the AI Revolution

Humberto Carvalho; Pavel Zaykov; Asim Ukaye

arXiv:2208.02808 · cs.LG · August 5, 2022

TL;DR

This paper reviews hardware and software optimizations for Deep Neural Networks, focusing on GPU optimizations, and demonstrates them by improving a state-of-the-art optical flow network on an edge AI platform.

Contribution

The paper introduces two types of DNN optimizations: one that alters the model and requires re-training, and one that does not. The authors believe both can be used on other AI inference platforms, and they demonstrate them in practice on the Nvidia Jetson AGX Xavier.

Findings

01. Enhanced RAFT optical flow network performance

02. Optimizations applicable to various AI inference platforms

03. Demonstrated improvements on Nvidia Jetson AGX Xavier

Abstract

This paper presents a state-of-the-art overview of how to architect, design, and optimize Deep Neural Networks (DNNs) such that performance is improved and accuracy is preserved. The paper covers a set of optimizations that span the entire Machine Learning processing pipeline. We introduce two types of optimizations. The first alters the DNN model and requires NN re-training, while the second does not. We focus on GPU optimizations, but we believe the presented techniques can be used with other AI inference platforms. To demonstrate the DNN model optimizations, we improve one of the most advanced deep network architectures for optical flow, RAFT (arXiv:2003.12039), on a popular edge AI inference platform (Nvidia Jetson AGX Xavier).
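
The abstract does not name the individual techniques in each category. As a hedged illustration only, the sketch below shows symmetric post-training int8 weight quantization, a widely used optimization that needs no re-training. It is an assumption chosen for illustration, not a method confirmed by this page, and the helper names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of float weights to the int8 range.

    Illustrative sketch only: not taken from the paper.
    """
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to [-127, 127].
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]


# Hypothetical weights standing in for one layer of a trained network.
weights = [0.42, -1.27, 0.03, 0.88, -0.51]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print("int8 values:", q)
print("scale:", scale)
print("max reconstruction error:", max_error)
```

The trained weights are only rescaled and rounded, so no training data or gradient step is involved. This is what separates this family of optimizations from those that alter the model and require re-training.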

Taxonomy

Topics: Neural Networks and Reservoir Computing