Multi-Component Optimization and Efficient Deployment of Neural-Networks on Resource-Constrained IoT Hardware

Bharath Sudharsan; Dineshkumar Sundaram; Pankesh Patel; John G. Breslin; Muhammad Intizar Ali; Schahram Dustdar; Albert Zomaya; Rajiv Ranjan

arXiv:2204.10183 · cs.LG · April 22, 2022 · 1 citation

TL;DR

This paper introduces a comprehensive multi-component model optimization process that enables high-performance, low-power AI models to run efficiently on resource-limited IoT devices, enhancing offline analytics capabilities.

Contribution

The paper presents a novel, open-source optimization sequence for compressing and accelerating neural networks specifically for resource-constrained IoT hardware.

Findings

01 Models compressed by 12.06x in size

02 Accuracy improved by 0.13% to 0.27%

03 Inference time reduced to 0.06 ms per unit inference

Abstract

Most IoT devices, such as smartwatches, smart plugs, and HVAC controllers, are powered by hardware with a constrained specification (limited memory, clock speed, and processing power) that is insufficient to accommodate and execute large, high-quality models. On such resource-constrained devices, manufacturers still manage to provide attractive functionality (to boost sales) by following the traditional approach of programming IoT devices and products to collect and transmit data (images, audio, sensor readings, etc.) to their cloud-based ML analytics platforms. For decades, this online approach has faced issues such as compromised data streams, non-real-time analytics due to latency, bandwidth constraints, costly subscriptions, and, more recently, privacy concerns raised by users and by the GDPR guidelines. In this paper, to enable ultra-fast and accurate AI-based offline analytics on…

Code & Models

Repositories

bharathsudharsan/cnn_on_mcu · tf · Official

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · CCD and CMOS Imaging Sensors

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings