Adaptive Resolution Inference (ARI): Energy-Efficient Machine Learning for Internet of Things

Ziheng Wang; Pedro Reviriego; Farzad Niknia; Javier Conde; Shanshan Liu; Fabrizio Lombardi

arXiv:2408.14528·cs.LG·August 28, 2024

TL;DR

The paper introduces Adaptive Resolution Inference (ARI), a method that reduces energy consumption in IoT machine learning by selectively running a full-precision or reduced-precision model based on confidence margins, achieving energy savings between 40% and 85% across datasets.

Contribution

ARI is a novel adaptive inference approach that balances energy efficiency and model accuracy by dynamically choosing between quantized and full models based on confidence margins.
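The selection rule described above can be sketched as a two-stage cascade. This is a minimal illustration, not the authors' implementation: it assumes the confidence margin is the gap between the two highest class scores, and the names `confidence_margin`, `ari_predict`, `reduced_model`, `full_model`, and `threshold` are all hypothetical.

```python
def confidence_margin(scores):
    """Gap between the highest and second-highest class score.

    Assumption: this is one common definition of a confidence margin;
    the paper may define it differently.
    """
    top, second = sorted(scores, reverse=True)[:2]
    return top - second


def argmax(scores):
    """Index of the largest score."""
    return max(range(len(scores)), key=scores.__getitem__)


def ari_predict(x, reduced_model, full_model, threshold):
    """Classify x with the reduced-precision model, falling back to the full model.

    reduced_model and full_model are hypothetical callables that map one
    input to a list of class scores. Returns (predicted_class, escalated),
    where escalated is True when the full model had to be run.
    """
    scores = reduced_model(x)
    if confidence_margin(scores) >= threshold:
        # The reduced-precision result is confident enough; stop here.
        return argmax(scores), False
    # Margin too small: rerun the inference with the full model.
    return argmax(full_model(x)), True
```

In this sketch, raising `threshold` sends more inputs to the full model, trading energy for accuracy, while lowering it does the opposite.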

Findings

01 Energy savings range from 40% to 85% across datasets.

02 Most inferences are performed with reduced precision; only a few require the full model.

03 The approach offers an effective tradeoff between energy consumption and accuracy.
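To see why savings depend on how rarely the full model is needed, a rough per-inference energy model helps. This is a back-of-the-envelope sketch under two assumptions that do not come from the paper: every inference first pays for the reduced-precision model, and escalated inferences additionally pay for the full model. The function names and parameters are hypothetical.

```python
def ari_relative_energy(reduced_cost, escalation_rate):
    """Expected energy per inference, relative to always running the full model.

    reduced_cost: energy of one reduced-precision inference as a fraction
        of one full-model inference (hypothetical parameter, 0 to 1).
    escalation_rate: fraction of inferences rerun with the full model
        (hypothetical parameter, 0 to 1).
    """
    if not (0.0 <= reduced_cost <= 1.0 and 0.0 <= escalation_rate <= 1.0):
        raise ValueError("both arguments must lie in [0, 1]")
    # Every input pays for the reduced model; escalated inputs also pay
    # for one full-model inference (cost 1.0 in relative units).
    return reduced_cost + escalation_rate * 1.0


def ari_savings(reduced_cost, escalation_rate):
    """Fraction of energy saved versus always running the full model."""
    return 1.0 - ari_relative_energy(reduced_cost, escalation_rate)
```

For example, with made-up inputs where the reduced model costs a quarter of the full model and one inference in ten is escalated, `ari_savings(0.25, 0.1)` works out to roughly 0.65. These inputs are illustrative only and are not measurements from the paper.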

Abstract

The implementation of machine learning in Internet of Things devices poses significant operational challenges due to limited energy and computation resources. In recent years, significant efforts have been made to implement simplified ML models that can achieve reasonable performance while reducing computation and energy, for example by pruning weights in neural networks or by using reduced precision for the parameters and arithmetic operations. However, this type of approach is limited by the performance of the ML implementation, i.e., by the loss, for example in accuracy, caused by the model simplification. In this article, we present adaptive resolution inference (ARI), a novel approach that makes it possible to evaluate new tradeoffs between energy dissipation and model performance in ML implementations. The main principle of the proposed approach is to run inferences with reduced precision…

Taxonomy

Methods: Pruning