Convolutions Predictable Offloading to an Accelerator: Formalization and Optimization
Benjamin Husson (C-S Group), Mohammed Belcaïd (C-S Group), Thomas Carle (IRIT-TRACES), Claire Pagetti

TL;DR
This paper formalizes how convolutional neural network computations are offloaded in several steps to an accelerator with limited on-chip memory, and searches for the sequence of steps with the shortest duration.
Contribution
It introduces a formal framework for predicting and optimizing convolution offloading strategies to accelerators with constrained memory resources.
Findings
Formalization of convolution offloading sequences
Development of a Python-based simulator for strategy analysis
Identification of optimal offloading strategies under constraints
Abstract
Convolutional neural networks (CNNs) require a large number of multiply-accumulate (MAC) operations. To meet real-time constraints, they often need to be executed on specialized accelerators composed of an on-chip memory and a processing unit. However, the on-chip memory is often insufficient to store all the data required to compute a CNN layer. Thus, the computation must be performed in several offloading steps. We formalise such sequences of steps and apply our formalism to a state-of-the-art decomposition of convolutions. In order to find optimal strategies in terms of duration, we encode the problem as a set of constraints. A Python-based simulator allows in-depth analysis of the computed strategies.
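To make the idea of offloading steps concrete, here is a minimal Python sketch. It is not the paper's formalism, decomposition, or simulator. It assumes a stride-1 convolution without padding, a memory capacity counted in elements, and a simple greedy split of the output into horizontal bands. The names `footprint` and `plan_row_tiles` are hypothetical.

```python
def footprint(rows_out, w_in, c_in, k, c_out):
    """Elements that must sit on chip to compute `rows_out` output rows.

    Assumption: stride 1, no padding, and inputs, weights and outputs
    all reside in the on-chip memory at the same time.
    """
    w_out = w_in - k + 1
    inputs = (rows_out + k - 1) * w_in * c_in   # input band incl. kernel overlap
    weights = k * k * c_in * c_out              # all kernels of the layer
    outputs = rows_out * w_out * c_out          # output band being produced
    return inputs + weights + outputs


def plan_row_tiles(h_in, w_in, c_in, k, c_out, capacity):
    """Greedily split the output rows into steps that each fit in `capacity`.

    Returns a list of (first_output_row, number_of_rows) pairs.
    """
    h_out = h_in - k + 1
    rows = h_out
    # Shrink the band until one step fits in the on-chip memory.
    while rows >= 1 and footprint(rows, w_in, c_in, k, c_out) > capacity:
        rows -= 1
    if rows < 1:
        raise ValueError("a single output row does not fit in on-chip memory")
    return [(start, min(rows, h_out - start)) for start in range(0, h_out, rows)]


if __name__ == "__main__":
    # 8x8 single-channel input, one 3x3 kernel, room for 70 elements on chip.
    print(plan_row_tiles(h_in=8, w_in=8, c_in=1, k=3, c_out=1, capacity=70))
    # → [(0, 3), (3, 3)]
```

In this toy setting the 6 output rows are computed in two steps of 3 rows each, because a band of 4 rows would need 81 elements. A greedy split like this gives only one feasible sequence. Finding a sequence that is optimal in duration is the problem the paper encodes as constraints.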
Taxonomy
Topics: Advanced Neural Network Applications · Big Data and Digital Economy · Adversarial Robustness in Machine Learning
