Training of deep residual networks with stochastic MG/OPT

Cyrill von Planta; Alena Kopanicakova; Rolf Krause

arXiv:2108.04052 · cs.LG · August 10, 2021

TL;DR

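This paper trains deep residual networks with a stochastic variant of the nonlinear multigrid method MG/OPT. It reports significant speed-ups and additional robustness when training on MNIST, and its experiments indicate that multilevel training can double as a pruning technique.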

Contribution

The paper presents a stochastic variant of MG/OPT for residual networks and builds the multilevel hierarchy from the dynamical systems viewpoint specific to residual networks.
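As an illustration of the dynamical systems viewpoint, a residual network can be read as a forward Euler discretization of an ODE, so a coarser network corresponds to fewer layers with a larger time step. The Python sketch below assumes scalar tanh layers and very simple transfer operators. The names `forward`, `restrict`, and `prolong`, the layer form, and the choice of operators are illustrative assumptions and are not taken from the paper or its repository.

```python
import math

def forward(x, thetas, T=1.0):
    """Forward-Euler ResNet: x_{k+1} = x_k + h * tanh(w_k * x_k + b_k), with h = T / depth."""
    h = T / len(thetas)
    for w, b in thetas:
        x = x + h * math.tanh(w * x + b)
    return x

def restrict(thetas):
    """Assumed coarsening: keep every second layer, so the time step doubles."""
    return thetas[::2]

def prolong(coarse):
    """Assumed refinement: copy each coarse layer into two fine layers
    (parameters piecewise constant in time), so the time step halves."""
    return [layer for layer in coarse for _ in range(2)]

# Hypothetical parameters (w_k, b_k) for a 4-layer coarse network.
coarse = [(0.5, 0.1), (0.4, 0.0), (0.3, -0.1), (0.2, 0.05)]
fine = prolong(coarse)  # 8-layer network discretizing the same ODE

out_coarse = forward(1.0, coarse)
out_fine = forward(1.0, fine)
```

Because both networks discretize the same underlying ODE on the same time interval, their outputs stay close, which is the property that lets shallower networks serve as coarse levels for a deeper one.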

Findings

01 Significant speed-ups when training deep residual networks on MNIST.

02 Additional robustness during training.

03 Multilevel training can be used as a pruning technique: many of the auxiliary networks reach accuracies comparable to the original network.

Abstract

We train deep residual networks with a stochastic variant of the nonlinear multigrid method MG/OPT. To build the multilevel hierarchy, we use the dynamical systems viewpoint specific to residual networks. We report significant speed-ups and additional robustness for training deep residual networks on MNIST. Our numerical experiments also indicate that multilevel training can be used as a pruning technique, as many of the auxiliary networks have accuracies comparable to the original network.
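For readers unfamiliar with MG/OPT, a two-level cycle can be sketched as follows: smooth on the fine level, minimize a coarse objective shifted by a first-order consistency term, prolong the coarse correction, apply it with a line search, and smooth again. The Python sketch below uses a toy quadratic objective and exact gradients. In a stochastic variant the gradients would come from mini-batches, which is not modeled here. All function names, the transfer operators, and the toy problem are assumptions for illustration and do not reproduce the authors' implementation.

```python
def grad_descent(grad, x, steps, lr):
    """Plain gradient descent, used as smoother and as coarse solver."""
    for _ in range(steps):
        x = [xi - lr * gi for xi, gi in zip(x, grad(x))]
    return x

def prolong(y):
    """P: copy each coarse entry into two fine entries."""
    return [v for v in y for _ in range(2)]

def restrict_grad(g):
    """R = P^T: sum each pair of fine gradient entries."""
    return [g[i] + g[i + 1] for i in range(0, len(g), 2)]

def restrict_params(x):
    """Average each pair of fine parameters to get the coarse start point."""
    return [0.5 * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]

def mgopt_two_level(f_h, grad_h, grad_H, x_h, lr=0.1, smooth=2, coarse_steps=10):
    x_h = grad_descent(grad_h, x_h, smooth, lr)            # pre-smoothing
    y0 = restrict_params(x_h)
    # Consistency term: makes the shifted coarse gradient at y0 equal R * grad_h(x_h).
    v = [a - b for a, b in zip(grad_H(y0), restrict_grad(grad_h(x_h)))]
    shifted = lambda y: [g - vi for g, vi in zip(grad_H(y), v)]
    y = grad_descent(shifted, y0, coarse_steps, lr)        # approximate coarse solve
    e = prolong([a - b for a, b in zip(y, y0)])            # prolonged correction
    alpha, f0 = 1.0, f_h(x_h)
    while alpha > 1e-8:                                    # backtracking line search
        trial = [xi + alpha * ei for xi, ei in zip(x_h, e)]
        if f_h(trial) < f0:
            x_h = trial
            break
        alpha *= 0.5
    return grad_descent(grad_h, x_h, smooth, lr)           # post-smoothing

# Toy problem (hypothetical): fit fine targets t; the coarse level fits pair averages s.
t = [1.0, 1.2, -0.5, -0.3]
s = restrict_params(t)
f_h = lambda x: 0.5 * sum((xi - ti) ** 2 for xi, ti in zip(x, t))
grad_h = lambda x: [xi - ti for xi, ti in zip(x, t)]
grad_H = lambda y: [2.0 * (yj - sj) for yj, sj in zip(y, s)]

x0 = [0.0, 0.0, 0.0, 0.0]
x1 = mgopt_two_level(f_h, grad_h, grad_H, x0)
```

The line search only accepts the coarse correction if it lowers the fine objective, so a poor coarse model cannot make the fine-level iterate worse in this sketch.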

Code & Models

Repositories

eulerinstitute/mgopt_icml21 (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis

Methods: Pruning