Latency-aware Unified Dynamic Networks for Efficient Image Recognition
Yizeng Han, Zeyu Liu, Zhihang Yuan, Yifan Pu, Chaofei Wang, Shiji Song, Gao Huang

TL;DR
LAUDNet is a unified framework for dynamic image recognition models that optimizes latency through integrated paradigms and scheduling, significantly reducing inference time while maintaining accuracy.
Contribution
This paper introduces LAUDNet, a comprehensive framework combining multiple dynamic paradigms with scheduling optimization guided by a latency predictor, bridging the gap between theoretical and practical efficiency.
Findings
LAUDNet reduces ResNet-101 latency by over 50% on multiple GPUs.
The framework effectively balances accuracy and inference efficiency.
It demonstrates practical latency improvements in real-world vision tasks.
Abstract
Dynamic computation has emerged as a promising avenue to enhance the inference efficiency of deep networks. It allows selective activation of computational units, leading to a reduction in unnecessary computations for each input sample. However, the actual efficiency of these dynamic models can deviate from theoretical predictions. This mismatch arises from: 1) the lack of a unified approach due to fragmented research; 2) the focus on algorithm design over critical scheduling strategies, especially in CUDA-enabled GPU contexts; and 3) challenges in measuring practical latency, given that most libraries cater to static operations. Addressing these issues, we unveil the Latency-Aware Unified Dynamic Networks (LAUDNet), a framework that integrates three primary dynamic paradigms: spatially adaptive computation, dynamic layer skipping, and dynamic channel skipping. To bridge the theoretical…
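As a rough illustration of the three paradigms named in the abstract, the sketch below counts the theoretical multiply-accumulate operations (MACs) of a single convolution when a binary mask disables a whole layer, a subset of output channels, or a subset of spatial positions. This is not LAUDNet's implementation: the function names, the mask representation, and the 3x3 stride-1 convolution are assumptions made for the example, and the counts are theoretical only. As the abstract notes, practical latency can deviate from such figures.

```python
# Illustrative sketch only; all names below are hypothetical, not from the paper.

def conv_macs(h, w, c_in, c_out, k=3):
    """MACs of a dense k x k convolution (stride 1, 'same' padding assumed)."""
    return h * w * c_in * c_out * k * k

def layer_skipping_macs(h, w, c_in, c_out, execute_layer, k=3):
    """Dynamic layer skipping: one binary decision gates the whole layer."""
    return conv_macs(h, w, c_in, c_out, k) if execute_layer else 0

def channel_skipping_macs(h, w, c_in, channel_mask, k=3):
    """Dynamic channel skipping: only output channels with mask value 1 are computed."""
    active_channels = sum(channel_mask)
    return conv_macs(h, w, c_in, active_channels, k)

def spatial_macs(spatial_mask, c_in, c_out, k=3):
    """Spatially adaptive computation: only positions with mask value 1 are computed."""
    active_positions = sum(sum(row) for row in spatial_mask)
    return active_positions * c_in * c_out * k * k

# Toy layer: 4 x 4 feature map, 8 input channels, 8 output channels.
H, W, C_IN, C_OUT = 4, 4, 8, 8
dense = conv_macs(H, W, C_IN, C_OUT)

skipped = layer_skipping_macs(H, W, C_IN, C_OUT, execute_layer=False)
half_channels = channel_skipping_macs(H, W, C_IN, [1, 0, 1, 0, 1, 0, 1, 0])
quarter_pixels = spatial_macs(
    [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]],
    C_IN, C_OUT,
)

print(dense, skipped, half_channels, quarter_pixels)
```

In this toy setting the channel mask keeps half of the dense MACs and the spatial mask keeps a quarter, while a skipped layer costs none. Whether such savings translate into lower latency depends on scheduling and hardware, which is the gap the paper sets out to address.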
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices
