DeepliteRT: Computer Vision at the Edge
Saad Ashfaq, Alexander Hoffman, Saptarshi Mitra, Sudhakar Sah, MohammadHossein AskariHemmat, Ehsan Saboori

TL;DR
DeepliteRT enables efficient deployment of ultra low-bit quantized deep learning models on ARM edge devices, significantly reducing resource requirements and increasing inference speed for computer vision tasks.
Contribution
The paper introduces highly optimized ultra low-bit convolution operators and an end-to-end runtime system for deploying quantized models on ARM devices, outperforming existing methods.
Findings
Ultra low-bit convolution operators for ARM-based targets that outperform existing methods by up to 4.34x.
Speedups of up to 2.20x, 2.33x, and 2.17x over 32-bit, 8-bit, and 2-bit baselines, respectively.
Effective end-to-end deployment of ultra low-bit models on commodity hardware.
Abstract
The proliferation of edge devices has unlocked unprecedented opportunities for deep learning model deployment in computer vision applications. However, these complex models require considerable power, memory and compute resources that are typically not available on edge platforms. Ultra low-bit quantization presents an attractive solution to this problem by scaling down the model weights and activations from 32-bit to less than 8-bit. We implement highly optimized ultra low-bit convolution operators for ARM-based targets that outperform existing methods by up to 4.34x. Our operator is implemented within Deeplite Runtime (DeepliteRT), an end-to-end solution for the compilation, tuning, and inference of ultra low-bit models on ARM devices. Compiler passes in DeepliteRT automatically convert a fake-quantized model in full precision to a compact ultra low-bit representation, easing the…
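To make the idea of an "ultra low-bit representation" concrete, here is a minimal NumPy sketch of mapping full-precision weights to 2-bit integer codes and packing four codes into each byte. It is an illustration only and not DeepliteRT's implementation: the function names (`quantize_uniform`, `pack_2bit`, `unpack_2bit`), the min/max uniform quantization scheme, and the byte layout are all assumptions chosen for clarity.

```python
import numpy as np


def quantize_uniform(w, bits=2):
    """Map float weights to unsigned integer codes in [0, 2**bits - 1].

    Hypothetical helper: uses a simple min/max uniform scheme, which is an
    assumption and not necessarily the scheme used by DeepliteRT.
    Returns (codes, scale, offset) so that w is approximately
    codes * scale + offset.
    """
    levels = 2 ** bits - 1
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    codes = np.clip(np.round((w - w_min) / scale), 0, levels).astype(np.uint8)
    return codes, scale, w_min


def pack_2bit(codes):
    """Pack four 2-bit codes per byte, lowest-order bits first (assumed layout)."""
    flat = codes.ravel().astype(np.uint8)
    pad = (-flat.size) % 4  # zero-pad to a multiple of four codes
    flat = np.concatenate([flat, np.zeros(pad, dtype=np.uint8)])
    g = flat.reshape(-1, 4)
    packed = g[:, 0] | (g[:, 1] << 2) | (g[:, 2] << 4) | (g[:, 3] << 6)
    return packed.astype(np.uint8)


def unpack_2bit(packed, count):
    """Recover the first `count` 2-bit codes from a packed byte array."""
    shifts = np.array([0, 2, 4, 6], dtype=np.uint8)
    codes = (packed[:, None] >> shifts) & 0b11
    return codes.ravel()[:count].astype(np.uint8)
```

Under these assumptions, five 32-bit weights (20 bytes) pack into 2 bytes, and `codes * scale + offset` gives the dequantized approximation. A real low-bit convolution kernel would operate directly on the packed data rather than unpacking it first.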
Taxonomy
Topics: Advanced Neural Network Applications · CCD and CMOS Imaging Sensors · Advanced Image and Video Retrieval Techniques
Methods: Convolution
