Improving Neural Network Training in Low Dimensional Random Bases

Frithjof Gressmann; Zach Eaton-Rosen; Carlo Luschi

arXiv:2011.04720 · cs.LG · November 11, 2020 · 6 cites

TL;DR

This paper improves neural network training in low-dimensional random subspaces by re-drawing the random projection at each training step and by applying independent projections to different parts of the network, which yields faster and more scalable optimization.

Contribution

The paper proposes re-drawing the random subspace at each training step and applying independent projections to different parts of the network, which improves optimization performance and scalability.
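The difference between a fixed and a re-drawn subspace can be illustrated on a toy problem. The following is a minimal NumPy sketch, not the authors' implementation: the least-squares objective, the function names (`make_problem`, `draw_basis`, `train`), the Gaussian basis scaled by `1/sqrt(d)`, and all hyperparameters are assumptions chosen for illustration.

```python
import numpy as np

def make_problem(D=100, n=200, seed=0):
    """Toy least-squares problem (an illustrative stand-in for a network loss)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, D))
    b = A @ rng.standard_normal(D)
    return A, b

def loss_and_grad(theta, A, b):
    """Return 0.5 * mean squared residual and its gradient w.r.t. theta."""
    r = A @ theta - b
    return 0.5 * np.mean(r ** 2), A.T @ r / len(b)

def draw_basis(rng, d, D):
    """Draw d random directions in R^D.

    Assumed scaling: entries ~ N(0, 1/d), so that E[P^T P] = I and the
    projected gradient is an unbiased estimate of the full gradient.
    """
    return rng.standard_normal((d, D)) / np.sqrt(d)

def train(A, b, d=10, steps=2000, lr=0.01, redraw=True, seed=1):
    """Gradient descent restricted to a d-dimensional random subspace."""
    rng = np.random.default_rng(seed)
    D = A.shape[1]
    theta = np.zeros(D)
    P = draw_basis(rng, d, D)
    for _ in range(steps):
        if redraw:
            # Re-draw the subspace at every step instead of keeping it fixed.
            P = draw_basis(rng, d, D)
        _, g = loss_and_grad(theta, A, b)
        coords = P @ g                  # d coordinates of the gradient
        theta -= lr * (P.T @ coords)    # step taken inside the subspace
    return loss_and_grad(theta, A, b)[0]

A, b = make_problem()
loss_fixed = train(A, b, redraw=False)
loss_redrawn = train(A, b, redraw=True)
print(f"fixed basis: {loss_fixed:.4f}  re-drawn basis: {loss_redrawn:.4f}")
```

With a fixed basis the iterate can never leave a single d-dimensional subspace through the starting point, whereas re-drawing lets successive steps explore different directions of the full parameter space.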

Findings

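01. Re-drawing the random subspace at each training step improves optimization performance compared with keeping the projection fixed.

02. Applying independent projections to different parts of the network improves efficiency and scalability.

03. Generating the pseudo-random projections on demand reduces memory use and increases speed.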
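The second and third findings can be sketched together: each part of the network gets its own independent set of random directions, and every direction is regenerated from a seed when needed rather than stored. This is a hedged NumPy illustration only; the helper names (`direction`, `project`, `expand`, `compartment_update`), the key-based seeding scheme, and the `1/d` scaling are assumptions, and the paper's own code relies on hardware-specific random number generation that is not reproduced here.

```python
import numpy as np

def direction(key, size):
    """Regenerate one random direction from an integer key alone.

    The same key always yields the same vector, so the projection matrix
    never has to be stored. `key` is a tuple of non-negative ints.
    """
    return np.random.default_rng(list(key)).standard_normal(size)

def project(grad, key, d):
    """Coordinates of grad along d directions, generated one at a time."""
    return np.array([direction(key + (i,), grad.size) @ grad for i in range(d)])

def expand(coords, key, size):
    """Map d coordinates back to parameter space by regenerating the directions."""
    out = np.zeros(size)
    for i, c in enumerate(coords):
        out += c * direction(key + (i,), size)
    return out / len(coords)  # assumed scaling: expectation equals the gradient

def compartment_update(grad, sizes, d_per_part, seed):
    """Apply an independent random projection to each compartment of grad.

    `sizes` lists the length of each compartment (for example, one per layer).
    """
    out = np.empty_like(grad)
    start = 0
    for k, size in enumerate(sizes):
        part = slice(start, start + size)
        key = (seed, k)  # distinct key per compartment gives independent bases
        coords = project(grad[part], key, d_per_part)
        out[part] = expand(coords, key, size)
        start += size
    return out

g = np.arange(1.0, 13.0)
update = compartment_update(g, sizes=[5, 7], d_per_part=3, seed=7)
print(update.shape)
```

Only the `d` coordinates and the key need to be kept or exchanged; the directions themselves are recomputed on demand, trading extra random number generation for a much smaller memory footprint.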

Abstract

Stochastic Gradient Descent (SGD) has proven to be remarkably effective in optimizing deep neural networks that employ ever-larger numbers of parameters. Yet, improving the efficiency of large-scale optimization remains a vital and highly active area of research. Recent work has shown that deep neural networks can be optimized in randomly-projected subspaces of much smaller dimensionality than their native parameter space. While such training is promising for more efficient and scalable optimization schemes, its practical application is limited by inferior optimization performance. Here, we improve on recent random subspace approaches as follows: Firstly, we show that keeping the random projection fixed throughout training is detrimental to optimization. We propose re-drawing the random subspace at each step, which yields significantly better performance. We realize further improvements…

Code & Models

Repositories

graphcore-research/random-bases (TensorFlow, official)

Videos

Improving Neural Network Training in Low Dimensional Random Bases · SlidesLive

Taxonomy

Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Machine Learning and ELM