TL;DR
This paper investigates how the topology of skip connections in deep neural networks affects gradient flow and performance, introducing the NN-Mass metric to predict and optimize model accuracy and efficiency.
Contribution
The paper introduces NN-Mass, a new metric that links network topology to gradient propagation. It applies to both concatenation-type skip connections (DenseNets) and addition-type skip connections (ResNets, Wide-ResNets, MobileNets), supporting better model design and compression.
Findings
Models with similar NN-Mass reach similar test accuracy, even when their size and compute requirements differ significantly.
NN-Mass enables the design of compressed networks at initialization.
Empirical validation on datasets like CIFAR and ImageNet supports the metric's effectiveness.
Abstract
DenseNets introduce concatenation-type skip connections that achieve state-of-the-art accuracy in several computer vision tasks. In this paper, we reveal that the topology of the concatenation-type skip connections is closely related to the gradient propagation which, in turn, enables a predictable behavior of DNNs' test performance. To this end, we introduce a new metric called NN-Mass to quantify how effectively information flows through DNNs. Moreover, we empirically show that NN-Mass also works for other types of skip connections, e.g., for ResNets, Wide-ResNets (WRNs), and MobileNets, which contain addition-type skip connections (i.e., residuals or inverted residuals). As such, for both DenseNet-like CNNs and ResNets/WRNs/MobileNets, our theoretically grounded NN-Mass can identify models with similar accuracy, despite having significantly different size/compute requirements.…
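The abstract distinguishes concatenation-type skip connections (DenseNets) from addition-type ones (ResNets, WRNs, MobileNets). A minimal sketch of that difference follows. It is illustrative only and does not compute NN-Mass: feature maps are modeled as flat Python lists of channel values, and the function names and sample values are hypothetical, not taken from the paper.

```python
# Illustrative only: a feature map is modeled as a flat list of channel values.

def addition_skip(x, fx):
    """Addition-type (ResNet-style) skip: element-wise sum, channel count unchanged."""
    if len(x) != len(fx):
        raise ValueError("addition-type skips need matching channel counts")
    return [a + b for a, b in zip(x, fx)]


def concatenation_skip(x, fx):
    """Concatenation-type (DenseNet-style) skip: channels are stacked, so the count grows."""
    return x + fx


x = [1.0, 2.0, 3.0]    # layer input with 3 channels (hypothetical values)
fx = [0.5, 0.5, 0.5]   # output of some layer applied to x (hypothetical values)

added = addition_skip(x, fx)
stacked = concatenation_skip(x, fx)

# Addition keeps the width fixed; concatenation widens the input of the next layer.
print(len(added), len(stacked))
```

With concatenation, the input width of later layers depends on how many earlier layers feed into them, which is why the wiring pattern (topology) of such connections is a natural quantity to measure.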
