TL;DR
HTVM is a compiler that merges TVM with DORY to deploy neural networks on heterogeneous TinyML hardware. By maximizing accelerator utilization and minimizing data movement, it deploys the MLPerf Tiny suite on the DIANA SoC at 120x improved performance over plain TVM deployment.
Contribution
The paper introduces HTVM, a novel compiler that combines TVM and DORY to optimize neural network deployment on heterogeneous TinyML platforms.
Findings
120x performance improvement over plain TVM deployment on DIANA
Successful deployment of the MLPerf Tiny suite on the DIANA SoC
Use of DIANA's heterogeneous compute cores: a RISC-V CPU plus digital and analog compute-in-memory AI accelerators
Abstract
Optimal deployment of deep neural networks (DNNs) on state-of-the-art Systems-on-Chips (SoCs) is crucial for tiny machine learning (TinyML) at the edge. The complexity of these SoCs makes deployment non-trivial, as they typically contain multiple heterogeneous compute cores with limited, programmer-managed memory to optimize latency and energy efficiency. We propose HTVM, a compiler that merges TVM with DORY to maximize the utilization of heterogeneous accelerators and minimize data movements. HTVM allows deploying the MLPerf™ Tiny suite on DIANA, an SoC with a RISC-V CPU and digital and analog compute-in-memory AI accelerators, at 120x improved performance over plain TVM deployment.
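To make the idea of heterogeneous dispatch concrete, here is a minimal sketch of assigning each layer of a network to the first accelerator that supports it, with a CPU fallback for everything else. This is not HTVM's actual algorithm or API: the `Layer`, `Target`, and `dispatch` names, the capability tables, and the bit widths are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    op: str           # operator kind, e.g. "conv2d", "dense", "softmax"
    weight_bits: int  # quantization width of the weights


@dataclass(frozen=True)
class Target:
    name: str
    ops: frozenset          # operator kinds this target can execute
    weight_bits: frozenset  # weight precisions this target accepts


def dispatch(layers, accelerators, fallback="cpu"):
    """Map each layer to the first accelerator that supports it, else the fallback."""
    plan = {}
    for layer in layers:
        plan[layer.name] = fallback
        for acc in accelerators:
            if layer.op in acc.ops and layer.weight_bits in acc.weight_bits:
                plan[layer.name] = acc.name
                break
    return plan


# Hypothetical capability tables (assumptions, not taken from the paper).
ACCELERATORS = [
    Target("digital", frozenset({"conv2d", "dense"}), frozenset({8})),
    Target("analog", frozenset({"conv2d"}), frozenset({2})),
]

# Hypothetical four-layer network.
LAYERS = [
    Layer("conv1", "conv2d", 8),
    Layer("conv2", "conv2d", 2),
    Layer("fc", "dense", 8),
    Layer("out", "softmax", 8),
]

PLAN = dispatch(LAYERS, ACCELERATORS)
for layer_name, target_name in PLAN.items():
    print(f"{layer_name} -> {target_name}")
```

In this toy setup the unsupported `softmax` layer falls back to the CPU while the convolution and dense layers are offloaded. A real compiler would additionally have to tile each offloaded layer to fit the accelerator's limited, programmer-managed memory, which this sketch leaves out.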
