Feed-Forward On-Edge Fine-tuning Using Static Synthetic Gradient Modules

Robby Neven; Marian Verhelst; Tinne Tuytelaars; Toon Goedemé

arXiv:2009.09675·cs.CV·September 22, 2020

TL;DR

This paper introduces static Synthetic Gradient Modules that enable on-edge, memory-efficient training of deep models by predicting gradients during the forward pass, demonstrated on robotic grasping tasks.

Contribution

The work presents a method that uses static Synthetic Gradient Modules (SGMs) for memory-efficient, feed-forward training on embedded devices, removing the need to store all activations for a backward pass.

Findings

1. Comparable grasping success rates to standard backpropagation
2. Effective gradient prediction by static SGMs after meta-learning
3. Memory savings during training on embedded devices

Abstract

Training deep learning models on embedded devices is typically avoided since this requires more memory, computation and power than inference. In this work, we focus on lowering the amount of memory needed for storing all activations, which are required during the backward pass to compute the gradients. Instead, during the forward pass, static Synthetic Gradient Modules (SGMs) predict gradients for each layer. This allows training the model in a feed-forward manner without having to store all activations. We tested our method on a robot grasping scenario where a robot needs to learn to grasp new objects given only a single demonstration. The SGMs were first trained in a meta-learning manner on a set of common objects; during fine-tuning, they then provided the model with accurate gradients to successfully learn to grasp new objects. We have shown that our method has comparable results to…
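The feed-forward update described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The linear SGM form, the fully connected ReLU layers, and the names init_layer, init_sgm and feed_forward_update are all assumptions made for illustration, and the meta-learning stage that would give the SGMs useful weights is omitted (they are randomly initialised here). The point is only the control flow: each layer is updated from a predicted gradient as soon as its output is computed, so its input does not have to be kept for a later backward pass.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    # Hypothetical fully connected layer (the paper's model may differ).
    return {"W": rng.normal(0.0, 0.1, (n_in, n_out)), "b": np.zeros(n_out)}

def init_sgm(n_out):
    # Hypothetical linear SGM: maps a layer's output to a predicted
    # gradient of the loss w.r.t. that output. In the paper the SGMs are
    # meta-learned beforehand; here they are random placeholders.
    return {"M": rng.normal(0.0, 0.01, (n_out, n_out))}

def feed_forward_update(layers, sgms, x, lr=0.01):
    """One training step with no backward pass (illustrative sketch)."""
    h_in = x
    for layer, sgm in zip(layers, sgms):
        z = h_in @ layer["W"] + layer["b"]
        h_out = np.maximum(z, 0.0)            # ReLU activation
        g_out = h_out @ sgm["M"]              # predicted dL/dh_out; SGM stays fixed
        g_z = g_out * (z > 0)                 # chain rule through the ReLU
        layer["W"] -= lr * (h_in.T @ g_z) / len(x)
        layer["b"] -= lr * g_z.mean(axis=0)
        h_in = h_out                          # the old input is no longer needed
    return h_in
```

Only the current layer's input and output are alive at any point in the loop, which is where the memory saving over standard backpropagation would come from. The SGM parameters are never modified inside the step, matching the "static" property described in the abstract.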
