ALADIN: Accuracy-Latency-Aware Design-space Inference Analysis for Embedded AI Accelerators
T. Baldi, D. Casini, A. Biondi

TL;DR
ALADIN is a framework that evaluates the trade-offs among accuracy, latency, and resource use in embedded AI accelerators for neural networks, without needing hardware deployment.
Contribution
ALADIN introduces a platform-aware analysis method for mixed-precision quantized neural networks (QNNs), reducing development time and enabling detailed hardware-software co-design.
Findings
ALADIN accurately predicts inference bottlenecks and trade-offs.
Mixed-precision quantization impacts accuracy and latency significantly.
The framework facilitates rapid evaluation of architectural decisions.
Abstract
The inference of deep neural networks (DNNs) on resource-constrained embedded systems introduces non-trivial trade-offs among model accuracy, computational latency, and hardware limitations, particularly when real-time constraints must be satisfied. This paper presents ALADIN, an accuracy-latency-aware design-space inference analysis framework for mixed-precision quantized neural networks (QNNs) targeting scratchpad-based AI accelerators. ALADIN enables the evaluation and analysis of inference bottlenecks and design trade-offs across accuracy, latency, and resource consumption without requiring deployment on the target platform, thereby significantly reducing development time and cost. The framework introduces a progressive refinement process that transforms a canonical QONNX model into platform-aware representations by integrating both platform-independent implementation details and…
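As a rough illustration of what an accuracy-latency-resource trade-off analysis produces, the sketch below filters a set of candidate quantization configurations down to its Pareto front. It is not ALADIN's method or API: the `DesignPoint` type, the `dominates` and `pareto_front` helpers, the configuration names, and every number are hypothetical placeholders chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignPoint:
    """One candidate configuration (hypothetical; values are illustrative only)."""
    name: str
    accuracy: float    # higher is better
    latency_ms: float  # lower is better
    memory_kib: float  # lower is better


def dominates(a: DesignPoint, b: DesignPoint) -> bool:
    """True if a is no worse than b on every metric and strictly better on one."""
    no_worse = (
        a.accuracy >= b.accuracy
        and a.latency_ms <= b.latency_ms
        and a.memory_kib <= b.memory_kib
    )
    strictly_better = (
        a.accuracy > b.accuracy
        or a.latency_ms < b.latency_ms
        or a.memory_kib < b.memory_kib
    )
    return no_worse and strictly_better


def pareto_front(points: list[DesignPoint]) -> list[DesignPoint]:
    """Keep only the points that no other point dominates (input order preserved)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up design points: "wXaY" means X-bit weights and Y-bit activations.
candidates = [
    DesignPoint("w8a8", accuracy=0.91, latency_ms=12.0, memory_kib=512.0),
    DesignPoint("w4a8", accuracy=0.89, latency_ms=8.0, memory_kib=300.0),
    DesignPoint("w4a4", accuracy=0.84, latency_ms=6.0, memory_kib=256.0),
    DesignPoint("w8a4", accuracy=0.85, latency_ms=9.0, memory_kib=480.0),
]

front = pareto_front(candidates)
for point in front:
    print(point.name, point.accuracy, point.latency_ms, point.memory_kib)
```

In this toy data, `w8a4` drops out because `w4a8` is at least as good on all three metrics. A designer with a real-time deadline would then pick the most accurate point on the front whose latency fits the deadline.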
Taxonomy
Topics: Advanced Neural Network Applications · Embedded Systems Design Techniques · Adversarial Robustness in Machine Learning
