Increasing Depth Leads to U-Shaped Test Risk in Over-parameterized Convolutional Networks

Eshaan Nichani; Adityanarayanan Radhakrishnan; Caroline Uhler

arXiv:2010.09610 · cs.LG · June 8, 2021 · 5 cites

TL;DR

This paper investigates how increasing depth in over-parameterized convolutional networks affects test risk, revealing a U-shaped relationship where risk first decreases then increases with depth, supported by empirical and theoretical analysis.

Contribution

The paper analyzes how depth affects test risk in over-parameterized CNNs, combining empirical evidence from ResNets and the convolutional neural tangent kernel (CNTK) with a novel linear regression framework and theoretical analysis.

Findings

01. Test risk exhibits a U-shaped curve with increasing depth.

02. Increasing depth can both decrease and increase test risk depending on the regime.

03. Theoretical analysis identifies depths that minimize bias and variance components.

Abstract

Recent works have demonstrated that increasing model capacity through width in over-parameterized neural networks leads to a decrease in test risk. For neural networks, however, model capacity can also be increased through depth, yet understanding the impact of increasing depth on test risk remains an open question. In this work, we demonstrate that the test risk of over-parameterized convolutional networks is a U-shaped curve (i.e. monotonically decreasing, then increasing) with increasing depth. We first provide empirical evidence for this phenomenon via image classification experiments using both ResNets and the convolutional neural tangent kernel (CNTK). We then present a novel linear regression framework for characterizing the impact of depth on test risk, and show that increasing depth leads to a U-shaped test risk for the linear CNTK. In particular, we prove that the linear CNTK…
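
To make the linear regression setting mentioned in the abstract more concrete, here is a minimal sketch of over-parameterized linear regression in which a fixed linear map is applied to the inputs before fitting a minimum-norm interpolator. Everything in it is an assumption made for illustration: the function names, the synthetic Gaussian data, and in particular the diagonal scaling family indexed by `power`, which is a hypothetical stand-in and not the paper's CNTK construction or its depth parameter. It only shows that different linear maps yield interpolating solutions with different test risk.

```python
import numpy as np


def min_norm_fit(X, y, A):
    """Minimum-norm weights w that fit y from transformed features z = A @ x."""
    Z = X @ A.T                      # apply the linear map to every sample
    return np.linalg.pinv(Z) @ y     # pseudoinverse gives the min-norm solution


def mean_squared_error(w, A, X, y):
    """Mean squared error of the linear predictor on (X, y)."""
    preds = (X @ A.T) @ w
    return float(np.mean((preds - y) ** 2))


def run_demo(seed=0, n_train=20, n_test=500, d=50):
    """Compare train/test error across a hypothetical family of linear maps."""
    rng = np.random.default_rng(seed)

    # Synthetic, noiseless linear target (an assumption for illustration).
    beta = rng.normal(size=d) / np.sqrt(d)
    X_train = rng.normal(size=(n_train, d))
    X_test = rng.normal(size=(n_test, d))
    y_train = X_train @ beta
    y_test = X_test @ beta

    # Hypothetical family of maps: a diagonal scaling raised to a power.
    scales = np.linspace(1.0, 2.0, d)

    results = {}
    for power in (0, 1, 2):
        A = np.diag(scales ** power)
        w = min_norm_fit(X_train, y_train, A)
        results[power] = (
            mean_squared_error(w, A, X_train, y_train),  # ~0: interpolation
            mean_squared_error(w, A, X_test, y_test),    # depends on the map
        )
    return results


if __name__ == "__main__":
    for power, (train_err, test_err) in run_demo().items():
        print(f"power={power}  train_mse={train_err:.2e}  test_mse={test_err:.4f}")
```

Because there are fewer training samples than features (`n_train < d`), every map in the family fits the training data exactly, so any difference between them shows up only in the test error.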

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning

Methods: Linear Regression