CNNs are Globally Optimal Given Multi-Layer Support
Chen Huang, Chen Kong, and Simon Lucey

TL;DR
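This paper introduces a training strategy for CNNs that replaces ReLU with positive hard-thresholding. Once the multi-layer support is known, the network becomes linear, and alternating between weights and support gives faster convergence, conditions under which local descent finds a global optimum, and reported state-of-the-art results.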
Contribution
It proposes an alternation training method (between weights and support) that builds on a binary support interpretation of CNNs, leading to faster convergence and theoretical guarantees under certain conditions.
Findings
Faster training convergence compared to standard SGD.
Theoretical proof of global optimality under certain conditions.
Reported state-of-the-art performance on ImageNet and other benchmarks.
Abstract
Stochastic Gradient Descent (SGD) is the central workhorse for training modern CNNs. Although it gives impressive empirical performance, it can be slow to converge. In this paper we explore a novel strategy for training a CNN using an alternation strategy that offers substantial speedups during training. We make the following contributions: (i) replace the ReLU non-linearity within a CNN with positive hard-thresholding, (ii) reinterpret this non-linearity as a binary state vector, making the entire CNN linear if the multi-layer support is known, and (iii) demonstrate that under certain conditions a global optimum of the CNN can be found through local descent. We then employ a novel alternation strategy (between weights and support) for CNN training that leads to substantially faster convergence rates, nice theoretical properties, and state-of-the-art results across large scale…
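Contribution (ii) can be illustrated with a small sketch. This is not the authors' code: it assumes positive hard-thresholding keeps an entry when it exceeds a threshold lam and zeroes it otherwise, and it uses dense matrices in place of convolutions (a convolution is also a linear map). The names hard_threshold, forward, linear_map_given_support, and lam are hypothetical. The sketch records the binary support at each layer and then checks that, with those supports held fixed, the whole network collapses to a single matrix.

```python
import numpy as np

def hard_threshold(z, lam):
    # Assumed form of positive hard-thresholding: keep entries above lam, zero the rest.
    return np.where(z > lam, z, 0.0)

def forward(weights, x, lam):
    """Nonlinear forward pass. Returns the output and one binary support vector per hidden layer."""
    supports = []
    h = x
    for W in weights[:-1]:
        z = W @ h
        supports.append((z > lam).astype(z.dtype))  # binary state vector for this layer
        h = hard_threshold(z, lam)
    return weights[-1] @ h, supports

def linear_map_given_support(weights, supports):
    """With the supports fixed, the network is the product W_L diag(s_{L-1}) W_{L-1} ... diag(s_1) W_1."""
    A = np.eye(weights[0].shape[1])
    for W, s in zip(weights[:-1], supports):
        A = (s[:, None] * W) @ A  # same as diag(s) @ W @ A
    return weights[-1] @ A

rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 5)),
           rng.standard_normal((6, 8)),
           rng.standard_normal((3, 6))]
x = rng.standard_normal(5)
lam = 0.1  # hypothetical threshold value

y, supports = forward(weights, x, lam)
A = linear_map_given_support(weights, supports)
print(np.allclose(y, A @ x))
```

The matrix A is only valid for inputs that produce the same supports, which is why the abstract's alternation treats the support as a separate quantity to be updated alongside the weights.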
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques
Methods: Affine Coupling · Normalizing Flows
