Fused Depthwise Tiling for Memory Optimization in TinyML Deep Neural Network Inference
Rafael Stahl, Daniel Mueller-Gritschneder, Ulf Schlichtmann

TL;DR
This paper introduces Fused Depthwise Tiling (FDT), a novel memory optimization technique for TinyML DNN inference that reduces memory usage across various network layers without runtime overhead.
Contribution
FDT extends tiling methods to more network layers, significantly reduces memory in TinyML DNNs, and includes an automated flow for optimal tiling configuration.
Findings
Reduced memory usage by 76.2% and 18.1% on two models.
Provided alternative design points with no runtime overhead.
Enabled memory optimization where previous methods failed.
Abstract
Memory optimization for deep neural network (DNN) inference gains high relevance with the emergence of TinyML, which refers to the deployment of DNN inference tasks on tiny, low-power microcontrollers. Applications such as audio keyword detection or radar-based gesture recognition are heavily constrained by the limited memory on such tiny devices because DNN inference requires large intermediate run-time buffers to store activations and other intermediate data, which leads to high memory usage. In this paper, we propose a new Fused Depthwise Tiling (FDT) method for the memory optimization of DNNs, which, compared to existing tiling methods, reduces memory usage without inducing any run-time overhead. FDT applies to a larger variety of network layers than existing tiling methods that focus on convolutions. It improves TinyML memory optimization significantly by reducing memory of models…
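To make the idea of tiling along the depth (channel) dimension concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical two-layer block, y = W2 · relu(W1 · x), and treats "depthwise" tiling as splitting the hidden channels into tiles, which is one plausible reading of the abstract. The function names, the toy weights, and the use of peak hidden-buffer length as a memory proxy are all assumptions made for illustration. Each tile of the hidden activation is computed, consumed immediately by the second layer as a partial sum, and then discarded, so every multiply-accumulate is still performed exactly once.

```python
# Illustrative sketch only: names, shapes, and data are hypothetical.

def relu(values):
    """Element-wise ReLU on a list of numbers."""
    return [max(0.0, v) for v in values]


def dense(weights, x):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * a for w, a in zip(row, x)) for row in weights]


def untiled(w1, w2, x):
    """Reference: materializes the full hidden buffer before layer 2."""
    hidden = relu(dense(w1, x))
    return dense(w2, hidden), len(hidden)


def depthwise_tiled(w1, w2, x, tile):
    """Fuses both layers and processes `tile` hidden channels at a time.

    Returns the output and the largest hidden buffer that was ever live.
    """
    out = [0.0] * len(w2)  # accumulators for partial sums of layer 2
    peak_hidden = 0
    for start in range(0, len(w1), tile):
        stop = start + tile
        # Layer 1 restricted to the rows that produce this tile's channels.
        h_tile = relu(dense(w1[start:stop], x))
        peak_hidden = max(peak_hidden, len(h_tile))
        # Layer 2 restricted to the matching columns, accumulated into out.
        for i, row in enumerate(w2):
            out[i] += sum(w * a for w, a in zip(row[start:stop], h_tile))
    return out, peak_hidden


# Toy data (assumed): 3 inputs, 4 hidden channels, 2 outputs.
W1 = [[1, 0, -1], [2, 1, 0], [0, -1, 1], [1, 1, 1]]
W2 = [[1, 1, 1, 1], [0, 1, -1, 2]]
X = [1, 2, 3]

full_out, full_peak = untiled(W1, W2, X)
tiled_out, tiled_peak = depthwise_tiled(W1, W2, X, tile=2)
```

In this toy case both variants produce the same output, while the tiled variant holds only two hidden values at a time instead of four. With real floating-point weights the results may differ in the last bits, because the partial sums are added in a different order.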
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Handwritten Text Recognition Techniques
