Training DNNs in O(1) memory with MEM-DFA using Random Matrices

Tien Chu; Kamil Mykitiuk; Miron Szewczyk; Adam Wiktor; Zbigniew Wojna

arXiv:2012.11745 · cs.CV · December 23, 2020 · 1 citation

Open Access

TL;DR

This paper introduces MEM-DFA, a training method for deep neural networks whose memory usage stays constant as the number of layers grows. It builds on direct feedback alignment, a more biologically plausible alternative to backpropagation that propagates the error through random matrices.

Contribution

The paper proposes MEM-DFA, a memory-efficient training algorithm based on direct feedback alignment with random matrices. Its memory usage stays constant regardless of network depth.

Findings

01 MEM-DFA uses significantly less memory than BP, FA, and DFA.

02 Experiments on MNIST and CIFAR-10 validate the theoretical memory savings.

03 MEM-DFA increases computational cost only by a constant factor: one extra forward pass.

Abstract

This work presents a method for reducing memory consumption to constant complexity when training deep neural networks. The algorithm is based on two more biologically plausible alternatives to backpropagation (BP): direct feedback alignment (DFA) and feedback alignment (FA), which use random matrices to propagate the error. The proposed method, memory-efficient direct feedback alignment (MEM-DFA), exploits the greater independence of layers in DFA to avoid storing all activation vectors at once, unlike standard BP, FA, and DFA. As a result, the algorithm's memory usage is constant regardless of the number of layers in the network. The method increases the computational cost only by a constant factor: one extra forward pass. MEM-DFA, BP, FA, and DFA were evaluated, along with their memory profiles, on the MNIST and CIFAR-10 datasets using various neural network models. Our experiments…
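The two-pass idea described in the abstract can be illustrated with a minimal NumPy sketch. This is one reading of the abstract, not the authors' implementation: the function names (`init_params`, `forward_no_store`, `two_pass_dfa_step`), the tanh hidden units, the linear output layer, the squared-error loss, and the single-example update are all assumptions made for illustration. The first forward pass keeps only the output error; the second recomputes each layer's activation, applies that layer's DFA update from the fixed random feedback matrix, and discards the activation before moving on.

```python
import numpy as np

def init_params(sizes, rng):
    """Forward weights W_l, biases b_l, and fixed random feedback matrices B_l.

    One B_l per hidden layer, mapping the output error to that layer's width.
    (Initialisation scales are an assumption of this sketch.)
    """
    Ws = [rng.normal(0.0, 1.0 / np.sqrt(m), size=(n, m))
          for m, n in zip(sizes[:-1], sizes[1:])]
    bs = [np.zeros(n) for n in sizes[1:]]
    Bs = [rng.normal(0.0, 1.0 / np.sqrt(sizes[-1]), size=(n, sizes[-1]))
          for n in sizes[1:-1]]
    return Ws, bs, Bs

def forward_no_store(Ws, bs, x):
    """Forward pass that keeps only the current activation vector."""
    h = x
    for W, b in zip(Ws[:-1], bs[:-1]):
        h = np.tanh(W @ h + b)
    return Ws[-1] @ h + bs[-1]  # linear output layer (assumption)

def two_pass_dfa_step(Ws, bs, Bs, x, y, lr=0.1):
    """One DFA update using two forward passes and no stored activation list."""
    # Pass 1: compute the output error only; nothing else is retained.
    e = forward_no_store(Ws, bs, x) - y  # gradient of 0.5 * ||out - y||^2

    # Pass 2: recompute activations layer by layer and update immediately.
    h = x
    for l in range(len(Ws) - 1):
        # Activation is computed with the pre-update weights, so the result
        # matches a DFA step that had stored every activation.
        h_next = np.tanh(Ws[l] @ h + bs[l])
        # DFA: project the output error through the fixed random matrix B_l.
        delta = (Bs[l] @ e) * (1.0 - h_next ** 2)
        Ws[l] -= lr * np.outer(delta, h)
        bs[l] -= lr * delta
        h = h_next  # the previous activation can now be freed

    # The output layer uses the true error signal.
    Ws[-1] -= lr * np.outer(e, h)
    bs[-1] -= lr * e
    return 0.5 * float(e @ e)
```

In this sketch only `h`, `h_next`, and `e` are alive at any time, so the activation memory does not grow with the number of layers (for bounded layer widths), at the price of the extra forward pass mentioned in the abstract. Calling `two_pass_dfa_step` on parameters from `init_params([784, 256, 256, 10], np.random.default_rng(0))` with one flattened input and a one-hot target would perform a single update in place.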

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and ELM · Advanced Memory and Neural Computing

Methods: Feedback Alignment · Direct Feedback Alignment