Multi-Component Optimization and Efficient Deployment of Neural-Networks on Resource-Constrained IoT Hardware
Bharath Sudharsan, Dineshkumar Sundaram, Pankesh Patel, John G. Breslin, Muhammad Intizar Ali, Schahram Dustdar, Albert Zomaya, Rajiv Ranjan

TL;DR
This paper introduces a comprehensive multi-component model optimization process that enables high-performance, low-power AI models to run efficiently on resource-limited IoT devices, enhancing offline analytics capabilities.
Contribution
The paper presents a novel, open-source optimization sequence for compressing and accelerating neural networks specifically for resource-constrained IoT hardware.
Findings
Models compressed by 12.06x in size
Achieved 0.13% to 0.27% accuracy improvement
Inference time reduced to 0.06 ms per unit
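As a reading aid for the size figure above, the following minimal sketch shows how a compression factor such as 12.06x is conventionally computed (original size divided by compressed size). The function names and the byte counts are hypothetical and chosen only for illustration; they are not taken from the paper or its code.

```python
def compression_ratio(original_bytes: int, compressed_bytes: int) -> float:
    """Return how many times smaller the compressed model is."""
    if original_bytes <= 0 or compressed_bytes <= 0:
        raise ValueError("sizes must be positive")
    return original_bytes / compressed_bytes


def compressed_size_bytes(original_bytes: int, ratio: float) -> float:
    """Return the expected compressed size for a given compression ratio."""
    if original_bytes <= 0 or ratio <= 0:
        raise ValueError("size and ratio must be positive")
    return original_bytes / ratio


if __name__ == "__main__":
    # Hypothetical sizes, not measurements from the paper.
    original = 1_206_000   # bytes before optimization (assumed)
    compressed = 100_000   # bytes after optimization (assumed)
    print(f"ratio: {compression_ratio(original, compressed):.2f}x")
    print(f"size at 12.06x: {compressed_size_bytes(original, 12.06):.0f} bytes")
```

Under this definition, a model that shrinks by 12.06x occupies roughly 8.3% of its original storage, which is the quantity that matters when fitting a model into the flash memory of a constrained device.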
Abstract
The majority of IoT devices like smartwatches, smart plugs, HVAC controllers, etc., are powered by hardware with a constrained specification (low memory, clock speed and processor) which is insufficient to accommodate and execute large, high-quality models. On such resource-constrained devices, manufacturers still manage to provide attractive functionalities (to boost sales) by following the traditional approach of programming IoT devices/products to collect and transmit data (image, audio, sensor readings, etc.) to their cloud-based ML analytics platforms. For decades, this online approach has been facing issues such as compromised data streams, non-real-time analytics due to latency, bandwidth constraints, costly subscriptions, recent privacy issues raised by users and the GDPR guidelines, etc. In this paper, to enable ultra-fast and accurate AI-based offline analytics on…
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · CCD and CMOS Imaging Sensors
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
