Learning with Local Gradients at the Edge

Michael Lomnitz; Zachary Daniels; David Zhang; Michael Piacentino

arXiv:2208.08503 · cs.LG · September 19, 2022

TL;DR

This paper introduces tpSGD, a backpropagation-free training algorithm for neural networks that reduces memory usage and enables learning on edge devices, with accuracy within 5% of backpropagation on relatively shallow networks.

Contribution

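The paper presents tpSGD, a backpropagation-free optimization method that generalizes direct random target projection to arbitrary loss functions and to recurrent as well as feedforward networks. It trains layer by layer with local gradients, so it does not need to retain gradients across the whole network.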

Findings

01  tpSGD achieves accuracy within 5% of backpropagation on relatively shallow networks.

02  tpSGD outperforms other gradient-free algorithms in accuracy and efficiency.

03  tpSGD enables training of deep networks such as VGG with reduced memory requirements.

Abstract

To enable learning on edge devices with fast convergence and low memory, we present a novel backpropagation-free optimization algorithm dubbed Target Projection Stochastic Gradient Descent (tpSGD). tpSGD generalizes direct random target projection to work with arbitrary loss functions and extends target projection for training recurrent neural networks (RNNs) in addition to feedforward networks. tpSGD uses layer-wise stochastic gradient descent (SGD) and local targets generated via random projections of the labels to train the network layer-by-layer with only forward passes. tpSGD doesn't require retaining gradients during optimization, greatly reducing memory allocation compared to SGD backpropagation (BP) methods that require multiple instances of the entire neural network weights, input/output, and intermediate results. Our method performs comparably to BP gradient-descent within 5%…
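The layer-by-layer scheme the abstract describes can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names (`train_layerwise`, `predict`), the tanh activation, the squared-error local loss, and the full-batch updates (used in place of mini-batch SGD for brevity) are all assumptions made for illustration. Each hidden layer gets a fixed random projection of the one-hot labels as its local target and is trained on its own, using only forward passes through the layers already trained.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Encode integer class labels as one-hot rows."""
    out = np.zeros((labels.size, num_classes))
    out[np.arange(labels.size), labels] = 1.0
    return out


def train_layerwise(x, labels, hidden_sizes, num_classes,
                    epochs=200, lr=0.1, seed=0):
    """Train an MLP one layer at a time with local targets (illustrative only)."""
    rng = np.random.default_rng(seed)
    y = one_hot(labels, num_classes)
    layers = []
    h = x  # input to the layer currently being trained
    for width in hidden_sizes:
        w = rng.normal(0.0, 1.0 / np.sqrt(h.shape[1]), (h.shape[1], width))
        b = np.zeros(width)
        # Fixed random projection of the labels gives this layer's local target.
        proj = rng.normal(0.0, 1.0, (num_classes, width))
        target = np.tanh(y @ proj)  # squashed into the range of tanh (assumption)
        for _ in range(epochs):
            out = np.tanh(h @ w + b)
            # Gradient of the local squared error w.r.t. the pre-activation.
            # It involves only this layer, so nothing is propagated backward.
            delta = (out - target) * (1.0 - out ** 2) / len(h)
            w -= lr * h.T @ delta
            b -= lr * delta.sum(axis=0)
        layers.append((w, b))
        h = np.tanh(h @ w + b)  # frozen output feeds the next layer

    # Output layer: softmax regression on the true labels.
    w = np.zeros((h.shape[1], num_classes))
    b = np.zeros(num_classes)
    for _ in range(epochs):
        logits = h @ w + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        delta = (p - y) / len(h)
        w -= lr * h.T @ delta
        b -= lr * delta.sum(axis=0)
    layers.append((w, b))
    return layers


def predict(layers, x):
    """Forward pass through the trained layers; returns predicted class indices."""
    h = x
    for w, b in layers[:-1]:
        h = np.tanh(h @ w + b)
    w, b = layers[-1]
    return np.argmax(h @ w + b, axis=1)


if __name__ == "__main__":
    # Toy data: two well-separated Gaussian blobs (made up for the demo).
    rng = np.random.default_rng(1)
    x = np.vstack([rng.normal(-2, 0.5, (50, 2)), rng.normal(2, 0.5, (50, 2))])
    labels = np.array([0] * 50 + [1] * 50)
    model = train_layerwise(x, labels, hidden_sizes=[16, 16], num_classes=2)
    print("training accuracy:", (predict(model, x) == labels).mean())
```

In this sketch, only the weights of the layer being trained and one activation matrix are updated at a time, which is the kind of memory saving the abstract attributes to avoiding backpropagation. The sketch does not cover the recurrent case.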

Taxonomy

Topics: Machine Learning and ELM · Stochastic Gradient Optimization Techniques · Neural Networks and Applications

Methods: Stochastic Gradient Descent