Training of deep residual networks with stochastic MG/OPT
Cyrill von Planta, Alena Kopanicakova, Rolf Krause

TL;DR
This paper trains deep residual networks with a stochastic variant of the nonlinear multigrid method MG/OPT. It reports faster and more robust training on MNIST, and its experiments suggest that multilevel training can also serve as a pruning technique.
Contribution
It presents a stochastic MG/OPT approach tailored to residual networks, building the multilevel hierarchy from the dynamical systems viewpoint specific to these networks.
Findings
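Significant speed-ups when training deep residual networks on MNIST.
Additional robustness during training.
Many auxiliary (coarse-level) networks reach accuracies comparable to the original network, which indicates that multilevel training can be used as a pruning technique.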
Abstract
We train deep residual networks with a stochastic variant of the nonlinear multigrid method MG/OPT. To build the multilevel hierarchy, we use the dynamical systems viewpoint specific to residual networks. We report significant speed-ups and additional robustness for training MNIST on deep residual networks. Our numerical experiments also indicate that multilevel training can be used as a pruning technique, as many of the auxiliary networks have accuracies comparable to the original network.
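The dynamical systems viewpoint mentioned in the abstract is commonly stated as follows: a residual network with update x_{k+1} = x_k + h f(x_k; θ_k) is a forward Euler discretization of an ODE, so a shallower network with a larger step h can play the role of a coarse grid. The following is a minimal sketch of such a hierarchy in plain Python. The coarsening rule (keep every second layer and double the step), the tanh residual function, and all function names are illustrative assumptions, not the authors' implementation or their transfer operators.

```python
import math
import random

def layer(x, W, b):
    """Residual function f(x; W, b) = tanh(W x + b), using plain lists."""
    return [math.tanh(sum(w * xj for w, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def resnet_forward(x, params, h):
    """Forward Euler pass: x_{k+1} = x_k + h * f(x_k; W_k, b_k)."""
    for W, b in params:
        fx = layer(x, W, b)
        x = [xi + h * fi for xi, fi in zip(x, fx)]
    return x

def coarsen(params, h):
    """Assumed coarsening: keep every second layer and double the step,
    so the final time T = h * depth stays the same."""
    return params[::2], 2.0 * h

def build_hierarchy(params, h, levels):
    """Return [(params, h), ...] ordered from the finest to the coarsest network."""
    hierarchy = [(params, h)]
    for _ in range(levels - 1):
        params, h = coarsen(params, h)
        hierarchy.append((params, h))
    return hierarchy

# Hypothetical toy setup: width 3, depth 8, final time T = 1.
random.seed(0)
width, depth, T = 3, 8, 1.0
params = [
    ([[random.gauss(0.0, 0.1) for _ in range(width)] for _ in range(width)],
     [0.0] * width)
    for _ in range(depth)
]

hierarchy = build_hierarchy(params, T / depth, levels=3)
depths = [len(p) for p, _ in hierarchy]
outputs = [resnet_forward([1.0, 0.0, -1.0], p, h) for p, h in hierarchy]
```

Each level in `hierarchy` is itself a complete, shallower residual network covering the same time interval. This is why the auxiliary networks can be evaluated on their own, and why comparable accuracy on a coarse level can be read as a form of pruning.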
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis
Methods: Pruning
