How to Reach Real-Time AI on Consumer Devices? Solutions for Programmable and Custom Architectures
Stylianos I. Venieris, Ioannis Panopoulos, Ilias Leontiadis, and Iakovos S. Venieris

TL;DR
This paper explores cross-stack design techniques to enable real-time AI inference on consumer devices, addressing hardware heterogeneity, computational costs, and accuracy challenges for both programmable processors and custom accelerators.
Contribution
It presents diverse model-, system-, and hardware-level solutions for efficient AI deployment, emphasizing the integration of these techniques for real-time performance on commodity devices.
Findings
AI systems can run efficiently without overburdening mobile hardware
Custom accelerators like ASICs and FPGAs enable next-generation AI applications
Cross-stack solutions improve inference accuracy while maintaining real-time performance
Abstract
The unprecedented performance of deep neural networks (DNNs) has led to large strides in various Artificial Intelligence (AI) inference tasks, such as object and speech recognition. Nevertheless, deploying such AI models across commodity devices faces significant challenges: large computational cost, multiple performance objectives, hardware heterogeneity, and a common need for high accuracy together pose critical problems for deploying DNNs across the various embedded and mobile devices in the wild. As such, we have yet to witness the mainstream usage of state-of-the-art deep learning algorithms across consumer devices. In this paper, we provide preliminary answers to this potentially game-changing question by presenting an array of design techniques for efficient AI systems. We start by examining the major roadblocks when targeting both programmable processors and custom…
