Hardware architecture and routing-aware training for optimal memory usage: a case study

Jimmy Weber; Theo Ballet; Melika Payvand

arXiv:2412.01575 · cs.ET · December 3, 2024

TL;DR

This paper presents a hardware-algorithm co-design approach to training neural networks that optimizes the memory used for routing on resource-constrained hardware. In a case study, it yields higher accuracy and lower memory requirements than non-routing-aware training.

Contribution

It introduces a novel routing-aware training algorithm with a proxy-based mapping approximation, enabling efficient hardware-aware neural network deployment.

Findings

01. Achieved 5% higher accuracy with the same number of parameters.

02. Reduced memory usage by 10x compared to non-routing-aware methods.

03. The trained networks are fully mappable to the target hardware architecture.

Abstract

Efficient deployment of neural networks on resource-constrained hardware demands optimal use of on-chip memory. In event-based processors, this is particularly critical for routing architectures, where substantial memory is dedicated to managing network connectivity. While prior work has focused on optimizing event routing during hardware design, optimizing memory utilization for routing during network training remains underexplored. Key challenges include: (i) integrating routing into the loss function, which often introduces non-differentiability, and (ii) computational expense in evaluating network mappability to hardware. We propose a hardware-algorithm co-design approach to train routing-aware neural networks. To address challenge (i), we extend the DeepR training algorithm, leveraging dynamic pruning and random re-assignment to optimize memory use. For challenge (ii), we introduce…
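
The dynamic pruning and random re-assignment mentioned above can be illustrated with a minimal sketch of one rewiring step in the spirit of the original DeepR algorithm. This is not the paper's extended algorithm: the function name, the hyperparameters (lr, l1, noise_std, init) and the flat list layout are hypothetical, connection signs are omitted, and treating the number of active connections as a stand-in for the routing-memory budget is an assumption made only for illustration.

```python
import random


def deepr_rewire_step(theta, grads, lr=0.05, l1=1e-4, noise_std=1e-3,
                      init=1e-3, rng=None):
    """One simplified DeepR-style update on a flat list of connection parameters.

    theta[i] > 0  : connection i is active, with weight magnitude theta[i]
    theta[i] <= 0 : connection i is dormant (assumed to use no routing memory)
    All names and default values are illustrative assumptions.
    """
    rng = rng or random.Random()
    theta = list(theta)  # work on a copy, leave the caller's list untouched
    active_before = [i for i, t in enumerate(theta) if t > 0]

    # Gradient step with L1 shrinkage and a small random walk,
    # applied to active connections only.
    for i in active_before:
        theta[i] += -lr * grads[i] - lr * l1 + rng.gauss(0.0, noise_std)

    # Dynamic pruning: a connection whose parameter crossed zero becomes dormant.
    pruned = [i for i in active_before if theta[i] <= 0]
    for i in pruned:
        theta[i] = 0.0

    # Random re-assignment: reactivate as many dormant connections as were
    # pruned, so the number of active connections stays constant.
    dormant = [i for i, t in enumerate(theta) if t <= 0]
    for i in rng.sample(dormant, len(pruned)):
        theta[i] = init

    return theta
```

In this sketch the count of positive entries in theta is the same before and after every step, which is what lets a fixed budget of connections be enforced throughout training rather than by pruning afterwards.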

Taxonomy

Topics: Parallel Computing and Optimization Techniques