Optimizing DNN Inference on Multi-Accelerator SoCs at Training-time

Matteo Risso; Alessio Burrello; Daniele Jahier Pagliari

arXiv:2409.18566·cs.LG·February 24, 2025

TL;DR

This paper introduces ODiMO, a hardware-aware tool that explores fine-grain mappings of DNNs onto multi-accelerator SoCs at training time, reducing latency and energy consumption with minimal accuracy loss.

Contribution

ODiMO is the first tool to explore fine-grain mapping of DNN layers across heterogeneous accelerators at training time, optimizing energy or latency while accounting for accuracy.

Findings

1. Up to 8x latency reduction at iso-accuracy.
2. Up to 50.8x energy efficiency improvements.
3. Minimal accuracy loss (<0.3%) with optimized mappings.

Abstract

The demand for executing Deep Neural Networks (DNNs) with low latency and minimal power consumption at the edge has led to the development of advanced heterogeneous Systems-on-Chip (SoCs) that incorporate multiple specialized computing units (CUs), such as accelerators. Offloading DNN computations to a specific CU from the available set often exposes accuracy vs. efficiency trade-offs, due to differences in the CUs' supported operations (e.g., standard vs. depthwise convolution) or data representations (e.g., more or less aggressively quantized). A challenging and still unresolved issue is how to map a DNN onto these multi-CU systems so as to maximally exploit the parallelization possibilities while taking accuracy into account. To address this problem, we present ODiMO, a hardware-aware tool that efficiently explores fine-grain mappings of DNNs among various on-chip CUs during the training phase. ODiMO…
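
The mapping problem described in the abstract can be pictured with a toy latency model. The sketch below is not ODiMO's algorithm. It assumes a single layer whose output channels are divided between two CUs that run concurrently, so the slower CU sets the layer latency. The `ComputeUnit` type, the CU names, and the per-channel cycle costs are all hypothetical. The model also ignores accuracy, which the abstract identifies as the factor that makes the real problem hard.

```python
# Toy illustration only: NOT ODiMO's algorithm.
# All names and cycle costs below are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeUnit:
    name: str
    cycles_per_channel: float  # assumed cost of producing one output channel


def parallel_latency(channels_on_a: int, total_channels: int,
                     cu_a: ComputeUnit, cu_b: ComputeUnit) -> float:
    """Latency of one layer split across two CUs that run concurrently.

    The slower of the two CUs determines the layer latency.
    """
    channels_on_b = total_channels - channels_on_a
    return max(channels_on_a * cu_a.cycles_per_channel,
               channels_on_b * cu_b.cycles_per_channel)


def best_split(total_channels: int, cu_a: ComputeUnit,
               cu_b: ComputeUnit) -> tuple[int, float]:
    """Try every channel split and return (channels_on_a, latency)."""
    return min(
        ((n, parallel_latency(n, total_channels, cu_a, cu_b))
         for n in range(total_channels + 1)),
        key=lambda pair: pair[1],
    )


# Hypothetical CUs: one slower, one faster (e.g. more aggressively quantized).
cu_precise = ComputeUnit("cu_precise", cycles_per_channel=4.0)
cu_fast = ComputeUnit("cu_fast", cycles_per_channel=1.0)

split, latency = best_split(64, cu_precise, cu_fast)
print(f"{split} channels on {cu_precise.name}, latency {latency} cycles")
```

Under these assumed costs, sharing the layer between both CUs is faster than running it entirely on either one. A training-time approach would additionally have to weigh how each assignment affects accuracy, which this sketch leaves out.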

Taxonomy

Topics: Nuclear Physics and Applications · Particle Detector Development and Performance · Radiation Effects in Electronics

Methods: Sparse Evolutionary Training