NetCut: Real-Time DNN Inference Using Layer Removal

Mehrshad Zandigohar; Deniz Erdogmus; Gunar Schirner

arXiv:2101.05363 · cs.LG · August 12, 2021

TL;DR

This paper introduces NetCut, a method for constructing and selecting trimmed neural networks through layer removal to meet real-time inference deadlines while improving accuracy and reducing exploration time.

Contribution

It proposes layer removal for creating transfer learning-based trimmed networks and a methodology for efficiently selecting networks that meet specific latency constraints.
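The selection step can be illustrated with a minimal sketch. It assumes each candidate network already has an estimated latency and a measured accuracy, and it simply picks the most accurate candidate that fits the deadline. The `Candidate` type, the `select_under_deadline` function, and all numbers are hypothetical and chosen for illustration; this is not the paper's actual NetCut procedure.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One candidate network. All fields are illustrative assumptions."""
    name: str
    est_latency_ms: float  # latency predicted by some estimator
    accuracy: float        # validation accuracy after fine-tuning


def select_under_deadline(
    candidates: Sequence[Candidate], deadline_ms: float
) -> Optional[Candidate]:
    """Return the most accurate candidate whose estimated latency meets the deadline."""
    feasible = [c for c in candidates if c.est_latency_ms <= deadline_ms]
    if not feasible:
        return None
    # Break accuracy ties in favor of the faster network.
    return max(feasible, key=lambda c: (c.accuracy, -c.est_latency_ms))


# Made-up numbers, not results from the paper.
CANDIDATES = [
    Candidate("net_a", est_latency_ms=12.0, accuracy=0.71),
    Candidate("net_b", est_latency_ms=18.5, accuracy=0.78),
    Candidate("net_c", est_latency_ms=25.0, accuracy=0.83),
]

if __name__ == "__main__":
    print(select_under_deadline(CANDIDATES, deadline_ms=20.0))
```

With a 20 ms deadline the sketch skips `net_c` because it is too slow, and prefers `net_b` over `net_a` because it uses more of the available slack to gain accuracy.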

Findings

01. TRimmed Networks (TRNs) can expand the latency-accuracy Pareto frontier.

02. NetCut achieves up to 10.43% accuracy improvement under deadlines.

03. NetCut reduces network exploration time by 27x.
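To make the first finding concrete, here is a small sketch of how a latency-accuracy Pareto frontier can be computed and how one additional network can expand it. A point is on the frontier if no other point is at least as fast and at least as accurate while being strictly better in one of the two. The function name `pareto_frontier` and every data point are assumptions for illustration, not measurements from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (latency_ms, accuracy)


def pareto_frontier(points: Sequence[Point]) -> List[Point]:
    """Return the non-dominated points (lower latency and higher accuracy are better)."""
    frontier: List[Point] = []
    best_accuracy = float("-inf")
    # Sort by latency ascending; on equal latency, the more accurate point comes first.
    for latency, accuracy in sorted(points, key=lambda p: (p[0], -p[1])):
        # A slower point only earns a place if it is strictly more accurate
        # than everything faster than (or as fast as) it.
        if accuracy > best_accuracy:
            frontier.append((latency, accuracy))
            best_accuracy = accuracy
    return frontier


# Made-up baseline networks, not results from the paper.
BASELINE = [(10.0, 0.70), (15.0, 0.68), (20.0, 0.80), (20.0, 0.75), (30.0, 0.80)]
# A hypothetical trimmed network that lands between two baseline points.
TRIMMED = (15.0, 0.76)

if __name__ == "__main__":
    print(pareto_frontier(BASELINE))
    print(pareto_frontier(BASELINE + [TRIMMED]))
```

In this toy data the hypothetical trimmed network joins the frontier because no baseline network is both as fast and as accurate, which is the sense in which extra candidates can expand the frontier.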

Abstract

Deep learning plays a significant role in assisting humans in many aspects of their lives. As deep neural networks grow deeper over time, they extract more features to increase accuracy at the cost of additional inference latency. This accuracy-performance trade-off makes it harder to deploy them efficiently on embedded systems, which pair resource-constrained processors with strict deadlines. It can lead to selecting a network that meets a specified deadline with excess slack time, slack that could otherwise have contributed to higher accuracy. In this work, we propose: (i) the concept of layer removal as a means of constructing TRimmed Networks (TRNs) that are based on removing problem-specific features of a pretrained network used in transfer learning, and (ii) NetCut, a methodology based on an empirical or an analytical latency estimator, which only proposes…
