Third ArchEdge Workshop: Exploring the Design Space of Efficient Deep Neural Networks

Fuxun Yu; Dimitrios Stamoulis; Di Wang; Dimitrios Lymberopoulos; Xiang Chen

arXiv:2011.10912·cs.AR·November 24, 2020

Open Access

TL;DR

This paper explores the design space of efficient deep neural networks by combining static architecture profiling at the GPU core level with dynamic runtime traversal of feature map redundancy, aiming to improve accuracy-latency trade-offs.

Contribution

It introduces a novel full-stack GPU profiling approach for static architecture optimization and a new dynamic method for exploiting feature map redundancy during model execution.

Findings

01 Full-stack GPU profiling reveals better accuracy-latency trade-offs.

02 Dynamic feature map traversal improves runtime efficiency.

03 Highlights open research questions in DNN efficiency.

Abstract

This paper gives an overview of our ongoing work on the design space exploration of efficient deep neural networks (DNNs). Specifically, we cover two aspects: (1) static architecture design efficiency and (2) dynamic model execution efficiency. For static architecture design, unlike existing work that relies on end-to-end hardware modeling assumptions, we conduct full-stack profiling at the GPU core level to identify better accuracy-latency trade-offs for DNN designs. For dynamic model execution, unlike prior work that tackles model redundancy at the DNN channel level, we explore a new dimension of redundancy, DNN feature maps, which can be dynamically traversed at runtime. Finally, we highlight several open questions that are poised to draw research attention in the next few years.

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Parallel Computing and Optimization Techniques