Zero Stability Well Predicts Performance of Convolutional Neural Networks
Liangming Chen, Long Jin, Mingsheng Shang

TL;DR
This paper establishes a connection between zero stability in numerical analysis and CNN performance, introducing ZeroSNet, a zero-stable CNN architecture that outperforms existing models and is more robust to noise.
Contribution
The paper proposes ZeroSNet, a novel zero-stable CNN based on higher-order discretization, with theoretical guarantees and empirical evidence of improved performance and robustness.
Findings
ZeroSNet outperforms existing CNNs on multiple datasets.
ZeroSNet demonstrates enhanced robustness against input noise.
Whether a CNN is zero stable depends on the roots of the characteristic equation of its corresponding discrete solver.
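The role of the characteristic roots in the findings above can be illustrated with the standard root condition from numerical analysis: a linear multistep scheme is zero stable when every root of its first characteristic polynomial has modulus at most 1 and every root of modulus exactly 1 is simple. The following is a minimal sketch of that check, not the paper's code. The helper names `is_zero_stable` and `two_step_rho`, the tolerance, and the two-step update family are assumptions chosen for illustration only.

```python
import numpy as np


def is_zero_stable(rho_coeffs, tol=1e-6):
    """Check the root condition for a first characteristic polynomial.

    rho_coeffs lists the polynomial coefficients, highest degree first.
    tol is an illustrative numerical tolerance (an assumption, not from the paper).
    """
    roots = np.roots(rho_coeffs)
    moduli = np.abs(roots)

    # Any root strictly outside the unit circle breaks zero stability.
    if np.any(moduli > 1 + tol):
        return False

    # Roots on the unit circle must be simple (no repeated roots there).
    on_circle = roots[np.abs(moduli - 1) <= tol]
    for i in range(len(on_circle)):
        for j in range(i + 1, len(on_circle)):
            if abs(on_circle[i] - on_circle[j]) <= tol:
                return False
    return True


def two_step_rho(k):
    """Characteristic polynomial of a hypothetical two-step update

        x_{n+1} = (1 - k) * x_n + k * x_{n-1} + h * f(x_n),

    i.e. rho(z) = z^2 - (1 - k) z - k = (z - 1)(z + k).
    This family is an illustrative assumption, not the ZeroSNet structure.
    """
    return [1.0, -(1.0 - k), -k]


# Forward Euler, x_{n+1} = x_n + h * f(x_n), has rho(z) = z - 1.
euler_ok = is_zero_stable([1.0, -1.0])

# Sweep the coefficient k of the hypothetical two-step family.
sweep = {k: is_zero_stable(two_step_rho(k)) for k in (-1.0, -0.5, 0.5, 1.0, 1.5)}
```

In this sketch the two-step family has roots 1 and -k, so the check passes only when -k stays inside the unit circle or sits on it without coinciding with the root at 1. A fixed, training-free coefficient of this kind is the sort of parameter for which a zero-stable region can be stated.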
Abstract
The question of what kind of convolutional neural network (CNN) structure performs well is fascinating. In this work, we take one more step toward the answer by connecting zero stability and model performance. Specifically, we find that if a discrete solver of an ordinary differential equation is zero stable, the CNN corresponding to that solver performs well. We first interpret zero stability in the context of deep learning and then investigate the performance of existing first- and second-order CNNs under different zero-stability conditions. Based on this preliminary observation, we provide a higher-order discretization to construct CNNs and then propose a zero-stable network (ZeroSNet). To guarantee zero stability of the ZeroSNet, we first deduce a structure that meets the consistency conditions and then give a zero-stable region of a training-free parameter. By…
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Applications
