Gated Compression Layers for Efficient Always-On Models

Haiguang Li; Trausti Thormundsson; Ivan Poupyrev; Nicholas Gillian

arXiv:2303.08970 · cs.LG · March 17, 2023 · 1 citation

TL;DR

This paper introduces Gated Compression layers that transform neural networks into Gated Neural Networks, significantly reducing power consumption and improving accuracy for on-device machine learning applications.

Contribution

The paper presents a novel Gated Compression layer enabling existing neural networks to become Gated Neural Networks optimized for low-power, heterogeneous hardware environments.

Findings

01 Stops up to 96% of negative samples

02 Compresses 97% of positive samples

03 Maintains or improves model accuracy

Abstract

Mobile and embedded machine learning developers frequently have to compromise between two inferior on-device deployment strategies: sacrifice accuracy and aggressively shrink their models to run on dedicated low-power cores; or sacrifice battery by running larger models on more powerful compute cores such as neural processing units or the main application processor. In this paper, we propose a novel Gated Compression layer that can be applied to transform existing neural network architectures into Gated Neural Networks. Gated Neural Networks have multiple properties that excel for on-device use cases that help significantly reduce power, boost accuracy, and take advantage of heterogeneous compute cores. We provide results across five public image and audio datasets that demonstrate the proposed Gated Compression layer effectively stops up to 96% of negative samples, compresses 97% of…
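The two behaviours named in the abstract, stopping negative samples and compressing positive ones, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `gated_compression`, the linear-plus-sigmoid gate, the fixed `threshold`, and the index-based `keep_idx` compression are all assumptions made for illustration, since the summary above does not specify how the layer computes either step.

```python
import math


def gated_compression(features, gate_w, gate_b, keep_idx, threshold=0.5):
    """Hypothetical gate-then-compress step for one sample.

    Returns None when the gate stops the sample (treated as negative),
    otherwise a reduced feature vector to pass on to later layers.
    All parameters here are illustrative assumptions, not the paper's API.
    """
    # Gate (assumed form): logistic score from a linear projection.
    z = sum(f * w for f, w in zip(features, gate_w)) + gate_b
    score = 1.0 / (1.0 + math.exp(-z))
    if score < threshold:
        return None  # early stop: nothing is sent downstream

    # Compression (assumed form): keep only selected feature dimensions,
    # so less data crosses to the next compute core.
    return [features[i] for i in keep_idx]


weights, bias, kept = [1.0, 0.0, 0.0, 0.0], -0.5, [0, 2]
print(gated_compression([1.0, 2.0, 3.0, 4.0], weights, bias, kept))   # [1.0, 3.0]
print(gated_compression([-1.0, 2.0, 3.0, 4.0], weights, bias, kept))  # None
```

In this sketch a stopped sample costs only the gate computation, while a passed sample forwards two of its four features. In a heterogeneous setup such as the one the abstract describes, that is the point at which work would either end on the low-power core or be handed to a more powerful one.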

Taxonomy

Topics: Advanced Neural Network Applications · Anomaly Detection Techniques and Applications · Parallel Computing and Optimization Techniques