MATCH: Model-Aware TVM-based Compilation for Heterogeneous Edge Devices

Mohamed Amine Hamdi; Francesco Daghero; Giuseppe Maria Sarda; Josse Van Delm; Arne Symons; Luca Benini; Marian Verhelst; Daniele Jahier Pagliari; Alessio Burrello

arXiv:2410.08855·cs.DC·October 14, 2024

TL;DR

MATCH is a flexible TVM-based framework for deploying DNNs efficiently on heterogeneous edge devices. It relies on customizable hardware abstractions and achieves lower inference latency than specialized toolchains.

Contribution

The paper introduces MATCH, a novel retargetable DNN deployment framework that simplifies cross-platform optimization for heterogeneous MCUs using hardware-aware models.

Findings

01 MATCH reduces inference latency by up to 60.88× on DIANA.

02 MATCH achieves lower latency than custom toolchains such as HTVM and DORY.

03 Hardware-aware modeling enables competitive performance with less re-engineering.

Abstract

Streamlining the deployment of Deep Neural Networks (DNNs) on heterogeneous edge platforms, which couple instruction processors and hardware accelerators for tensor computations within the same micro-controller unit (MCU), is becoming one of the crucial challenges of the TinyML field. The best-performing DNN compilation toolchains are usually deeply customized for a single MCU family, and porting them to a different heterogeneous MCU family requires labor-intensive re-development of almost the entire compiler. At the other extreme, retargetable toolchains such as TVM fail to exploit the capabilities of custom accelerators and therefore generate general but unoptimized code. To overcome this duality, we introduce MATCH, a novel TVM-based DNN deployment framework designed for easy, agile retargeting across different MCU processors and accelerators, thanks to a customizable model-based…

Code & Models

Repositories

eml-eda/match (official)

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Simulation Techniques and Applications · Embedded Systems Design Techniques