Ordering Chaos: Memory-Aware Scheduling of Irregularly Wired Neural Networks for Edge Devices
Byung Hoon Ahn, Jinwon Lee, Jamie Menjay Lin, Hsin-Pai Cheng, Jilei Hou, Hadi Esmaeilzadeh

TL;DR
This paper introduces SERENITY, a memory-aware compiler that schedules the nodes of irregularly wired neural networks to minimize the peak memory footprint of intermediate activations on edge devices, improving on TensorFlow Lite.
Contribution
SERENITY uses dynamic programming to find a schedule with optimal peak memory, and graph rewriting to reduce the footprint further, for irregularly wired neural networks on resource-constrained edge devices.
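To make the scheduling problem concrete, the following is a minimal sketch, not the paper's implementation. It assumes a simplified memory model in which an activation stays live until all of its consumers have run, and the helper names (consumers, peak_memory, min_peak_schedule) and the toy graph are hypothetical. It evaluates the peak memory of a given execution order, then uses a memoized dynamic program over sets of already-executed nodes to find an order with minimal peak.

```python
from functools import lru_cache


def consumers(preds):
    """Invert a {node: [predecessors]} map into {node: set(consumers)}."""
    succs = {v: set() for v in preds}
    for v, ps in preds.items():
        for p in ps:
            succs[p].add(v)
    return succs


def peak_memory(order, preds, size):
    """Peak sum of live activation sizes when nodes run in `order`."""
    succs = consumers(preds)
    done, live, peak = set(), set(), 0
    for v in order:
        assert all(p in done for p in preds[v]), "order is not topological"
        live.add(v)  # inputs and the new output coexist during execution
        peak = max(peak, sum(size[u] for u in live))
        done.add(v)
        # Free activations whose consumers have all run; sinks stay live.
        live = {u for u in live if not succs[u] or not succs[u] <= done}
    return peak


def min_peak_schedule(preds, size):
    """Return (minimal peak, order) via DP over sets of executed nodes."""
    succs = consumers(preds)
    nodes = frozenset(preds)

    def live_mem(done):
        # The live set depends only on which nodes have run, not their order.
        return sum(size[u] for u in done
                   if not succs[u] or not succs[u] <= done)

    @lru_cache(maxsize=None)
    def best(done):
        if done == nodes:
            return 0, ()
        result = None
        base = live_mem(done)
        for v in sorted(nodes - done):
            if all(p in done for p in preds[v]):  # v is ready to run
                rest_peak, rest_order = best(done | {v})
                cand = (max(base + size[v], rest_peak), (v,) + rest_order)
                if result is None or cand < result:
                    result = cand
        return result

    return best(frozenset())


# Hypothetical toy graph: two branches (b -> c, d -> e) that join at f.
PREDS = {"a": [], "b": ["a"], "c": ["b"], "d": ["a"], "e": ["d"], "f": ["c", "e"]}
SIZE = {"a": 1, "b": 4, "c": 1, "d": 4, "e": 1, "f": 1}

print(peak_memory("abdcef", PREDS, SIZE))  # breadth-first order -> 9
print(peak_memory("abcdef", PREDS, SIZE))  # branch-by-branch order -> 6
print(min_peak_schedule(PREDS, SIZE))      # -> (6, ('a', 'b', 'c', 'd', 'e', 'f'))
```

On this toy graph, finishing one branch before starting the other avoids holding both large activations (b and d) at once, which is the kind of gap between an arbitrary order and the optimum that the scheduler targets. The state space here is exponential in the worst case, so this sketch is only suitable for small graphs.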
Findings
Reduces peak memory by 1.68x against TensorFlow Lite with the dynamic programming-based scheduler.
Adding graph rewriting improves the reduction to 1.86x.
Incurs less than one minute of overhead.
Abstract
Recent advances demonstrate that irregularly wired neural networks from Neural Architecture Search (NAS) and Random Wiring can not only automate the design of deep neural networks but also emit models that outperform previous manual designs. These designs are especially effective when designing neural architectures under hard resource constraints (memory, MACs, ...), which highlights the importance of this class of neural network designs. However, such a move complicates the previously streamlined pattern of execution. In fact, one of the main challenges is that the order of such nodes in the neural network significantly affects the memory footprint of the intermediate activations. Current compilers do not schedule with regard to activation memory footprint, which significantly increases the peak compared to the optimum and renders them inapplicable to edge devices. To…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Parallel Computing and Optimization Techniques
Methods: Sigmoid Activation · Tanh Activation · Softmax · Long Short-Term Memory
