Gated Compression Layers for Efficient Always-On Models
Haiguang Li, Trausti Thormundsson, Ivan Poupyrev, Nicholas Gillian

TL;DR
This paper introduces Gated Compression layers that transform neural networks into Gated Neural Networks, significantly reducing power consumption and improving accuracy for on-device machine learning applications.
Contribution
The paper presents a novel Gated Compression layer enabling existing neural networks to become Gated Neural Networks optimized for low-power, heterogeneous hardware environments.
Findings
Stops up to 96% of negative samples
Compresses 97% of positive samples
Maintains or improves model accuracy
Abstract
Mobile and embedded machine learning developers frequently have to choose between two inferior on-device deployment strategies: sacrifice accuracy by aggressively shrinking their models to run on dedicated low-power cores, or sacrifice battery life by running larger models on more powerful compute cores such as neural processing units or the main application processor. In this paper, we propose a novel Gated Compression layer that can be applied to transform existing neural network architectures into Gated Neural Networks. Gated Neural Networks have multiple properties well suited to on-device use cases: they help significantly reduce power, boost accuracy, and take advantage of heterogeneous compute cores. We provide results across five public image and audio datasets that demonstrate the proposed Gated Compression layer effectively stops up to 96% of negative samples, compresses 97% of…
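The abstract names two behaviors of the layer: stopping negative samples early and compressing the positive samples that pass through. The summary here does not spell out the mechanism, so the following is only a minimal sketch under stated assumptions: the gate is taken to be a logistic score compared against a threshold, and compression is taken to be a fixed keep/drop mask over feature dimensions. The class name `GatedCompressionSketch` and all of its parameters are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


@dataclass
class GatedCompressionSketch:
    """Hypothetical illustration of a gate followed by a compression step.

    Assumptions (not from the source): the gate is a linear layer with a
    sigmoid, and compression drops the feature dimensions where
    keep_mask is False.
    """
    gate_weights: Sequence[float]
    gate_bias: float
    keep_mask: Sequence[bool]
    threshold: float = 0.5

    def gate_score(self, features: Sequence[float]) -> float:
        # Score in (0, 1): how likely the sample is worth passing on.
        z = sum(w * f for w, f in zip(self.gate_weights, features))
        return sigmoid(z + self.gate_bias)

    def forward(self, features: Sequence[float]) -> Optional[List[float]]:
        # Gate: stop likely-negative samples so later stages never run.
        if self.gate_score(features) < self.threshold:
            return None
        # Compress: forward only the kept dimensions to the next stage.
        return [f for f, keep in zip(features, self.keep_mask) if keep]


layer = GatedCompressionSketch(
    gate_weights=[1.0, 1.0, 0.0, 0.0],
    gate_bias=-1.0,
    keep_mask=[True, False, True, False],
)
stopped = layer.forward([0.0, 0.0, 5.0, 5.0])  # gate score below threshold
passed = layer.forward([2.0, 1.0, 3.0, 4.0])   # gate score above threshold
```

In a deployment of the kind the abstract describes, a `None` result would mean the more powerful compute core is never woken, while a shortened feature list would mean less data is moved between cores. How the gate and mask are trained is outside the scope of this sketch.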
Taxonomy
Topics: Advanced Neural Network Applications · Anomaly Detection Techniques and Applications · Parallel Computing and Optimization Techniques
