ALADIN: Accuracy-Latency-Aware Design-space Inference Analysis for Embedded AI Accelerators

T. Baldi; D. Casini; A. Biondi

arXiv:2603.08722 · cs.AR · March 11, 2026

Open Access

TL;DR

ALADIN is a framework that evaluates the trade-offs among accuracy, latency, and resource use in embedded AI accelerators for neural networks, without needing hardware deployment.

Contribution

ALADIN introduces a platform-aware analysis method for mixed-precision quantized neural networks, reducing development time and enabling detailed hardware-software co-design.

Findings

01. ALADIN accurately predicts inference bottlenecks and trade-offs.

02. Mixed-precision quantization impacts accuracy and latency significantly.

03. The framework facilitates rapid evaluation of architectural decisions.

Abstract

The inference of deep neural networks (DNNs) on resource-constrained embedded systems introduces non-trivial trade-offs among model accuracy, computational latency, and hardware limitations, particularly when real-time constraints must be satisfied. This paper presents ALADIN, an accuracy-latency-aware design-space inference analysis framework for mixed-precision quantized neural networks (QNNs) targeting scratchpad-based AI accelerators. ALADIN enables the evaluation and analysis of inference bottlenecks and design trade-offs across accuracy, latency, and resource consumption without requiring deployment on the target platform, thereby significantly reducing development time and cost. The framework introduces a progressive refinement process that transforms a canonical QONNX model into platform-aware representations by integrating both platform-independent implementation details and…
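The kind of trade-off analysis the abstract describes can be illustrated with a minimal Python sketch. This is not ALADIN's API or the authors' method: the `Candidate` type, the configuration labels, and every number below are hypothetical values invented for illustration. The sketch discards candidates that miss a latency deadline or exceed a scratchpad budget, then keeps the accuracy/latency Pareto front of the remainder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One mixed-precision configuration with estimated metrics (all hypothetical)."""
    name: str          # made-up label for a bit-width assignment
    accuracy: float    # estimated top-1 accuracy in [0, 1]
    latency_ms: float  # estimated inference latency in milliseconds
    memory_kib: int    # estimated scratchpad footprint in KiB


def feasible(c: Candidate, deadline_ms: float, scratchpad_kib: int) -> bool:
    """A candidate is feasible if it meets the deadline and fits in the scratchpad."""
    return c.latency_ms <= deadline_ms and c.memory_kib <= scratchpad_kib


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse on both metrics and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.latency_ms <= b.latency_ms
    better = a.accuracy > b.accuracy or a.latency_ms < b.latency_ms
    return no_worse and better


def explore(cands, deadline_ms: float, scratchpad_kib: int):
    """Return the feasible, non-dominated candidates sorted by latency."""
    ok = [c for c in cands if feasible(c, deadline_ms, scratchpad_kib)]
    front = [c for c in ok if not any(dominates(o, c) for o in ok)]
    return sorted(front, key=lambda c: c.latency_ms)


# Illustrative values only; these are not results from the paper.
CANDIDATES = [
    Candidate("w8a8", 0.92, 12.0, 480),
    Candidate("w4a8", 0.90, 8.0, 300),
    Candidate("w4a4", 0.85, 6.0, 220),
    Candidate("w2a4", 0.70, 6.5, 160),
    Candidate("w8a4", 0.86, 9.0, 400),
]

if __name__ == "__main__":
    for c in explore(CANDIDATES, deadline_ms=10.0, scratchpad_kib=512):
        print(c.name, c.accuracy, c.latency_ms)
```

In a real flow the metrics would come from an analysis framework's estimates rather than a hard-coded list; the filtering and dominance check stay the same whatever the source of the numbers.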

Taxonomy

Topics: Advanced Neural Network Applications · Embedded Systems Design Techniques · Adversarial Robustness in Machine Learning