How does topology influence gradient propagation and model performance of deep networks with DenseNet-type skip connections?

Kartikeya Bhardwaj; Guihong Li; Radu Marculescu

arXiv:1910.00780 · stat.ML · April 2, 2021

TL;DR

This paper investigates how the topology of skip connections in deep neural networks affects gradient flow and performance, introducing the NN-Mass metric to predict and optimize model accuracy and efficiency.

Contribution

The paper introduces NN-Mass, a new metric linking network topology to gradient propagation, applicable across various skip connection types, enabling better model design and compression.

Findings

1. NN-Mass predicts model accuracy across different architectures.

2. NN-Mass enables design of compressed networks at initialization.

3. Empirical validation on datasets like CIFAR and ImageNet supports the metric's effectiveness.

Abstract

DenseNets introduce concatenation-type skip connections that achieve state-of-the-art accuracy in several computer vision tasks. In this paper, we reveal that the topology of the concatenation-type skip connections is closely related to the gradient propagation which, in turn, enables a predictable behavior of DNNs' test performance. To this end, we introduce a new metric called NN-Mass to quantify how effectively information flows through DNNs. Moreover, we empirically show that NN-Mass also works for other types of skip connections, e.g., for ResNets, Wide-ResNets (WRNs), and MobileNets, which contain addition-type skip connections (i.e., residuals or inverted residuals). As such, for both DenseNet-like CNNs and ResNets/WRNs/MobileNets, our theoretically grounded NN-Mass can identify models with similar accuracy, despite having significantly different size/compute requirements.…
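The abstract distinguishes concatenation-type skip connections (DenseNets) from addition-type skip connections (ResNets, WRNs, MobileNets). The difference can be illustrated with a minimal sketch in plain Python. This is not the paper's code and does not compute NN-Mass; the toy `toy_layer` function and all names below are hypothetical, chosen only to show how each skip type changes the feature width.

```python
# Illustrative sketch only: a toy stand-in for a learned layer, used to contrast
# concatenation-type and addition-type skip connections. All names are hypothetical.

def toy_layer(x, width):
    """Toy 'layer': emit `width` features, each a scaled sum of the inputs."""
    s = sum(x)
    return [s * (k + 1) / len(x) for k in range(width)]

def concat_skip(x, width):
    """DenseNet-style skip: new features are appended to the incoming features."""
    return x + toy_layer(x, width)  # list concatenation, so the width grows

def add_skip(x):
    """ResNet-style skip: the layer output is added elementwise to its input."""
    y = toy_layer(x, len(x))  # output width must match the input width
    return [a + b for a, b in zip(x, y)]

x = [1.0, 2.0, 3.0, 4.0]
h_concat = concat_skip(x, width=2)
h_add = add_skip(x)

print(len(x), len(h_concat), len(h_add))  # 4 6 4
```

With concatenation, the input features pass through unchanged and the representation widens at every layer; with addition, the width stays fixed and the input is mixed into the output.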
