A simple geometric proof for the benefit of depth in ReLU networks

Asaf Amrami; Yoav Goldberg

arXiv:2101.07126 · cs.LG · January 19, 2021

TL;DR

This paper gives a simple geometric proof of depth separation for ReLU networks: a sequence of classification problems that deep, narrow networks classify with zero error, while any fixed-depth network eventually needs exponentially many parameters to classify them correctly.

Contribution

It introduces a constructive geometric proof of depth separation based on a space-folding construction, using tools intended to be accessible to undergraduate computer science students.

Findings

01

For every problem $m$ in the sequence, a concrete network with depth linear in $m$ and constant width ($\leq 4$) classifies it with zero error.

02

For any fixed depth, there is an $m$ above which classifying problem $m$ correctly requires a number of parameters exponential in $m$.

03

The proof uses substantially simpler tools than earlier depth-separation proofs, although stronger bounds and results exist.

Abstract

We present a simple proof for the benefit of depth in multi-layer feedforward networks with rectified activation ("depth separation"). Specifically, we present a sequence of classification problems indexed by $m$ such that (a) for any fixed-depth rectified network there exists an $m$ above which classifying problem $m$ correctly requires an exponential number of parameters (in $m$); and (b) for any problem in the sequence, we present a concrete neural network with linear depth (in $m$) and small constant width ($\leq 4$) that classifies the problem with zero error. The constructive proof is based on geometric arguments and a space folding construction. While stronger bounds and results exist, our proof uses substantially simpler tools and techniques, and should be accessible to undergraduate students in computer science and people with similar backgrounds.
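To give a feel for the space-folding idea, here is a minimal one-dimensional sketch in Python. It is not the construction from the paper, only an analogue under stated assumptions: the interval $[0, 1]$ is folded onto itself by a tent map built from two ReLU units, and the label is whether the folded value exceeds 0.5. The names `fold`, `deep_net`, and `count_label_flips` are hypothetical and chosen for illustration.

```python
def relu(z):
    """Rectified linear unit."""
    return max(z, 0.0)


def fold(x):
    """One folding layer: the tent map 1 - |2x - 1| on [0, 1].

    It uses two ReLU units, since |z| = relu(z) + relu(-z).
    """
    return 1.0 - relu(2.0 * x - 1.0) - relu(1.0 - 2.0 * x)


def deep_net(x, m):
    """Compose m folding layers (depth m, two ReLU units per layer)."""
    for _ in range(m):
        x = fold(x)
    return x


def count_label_flips(m, samples_per_piece=8):
    """Count how often the label (output > 0.5) changes along [0, 1].

    Samples are midpoints of a uniform dyadic grid, so they never land
    exactly on a decision boundary and all arithmetic is exact.
    """
    n = samples_per_piece * 2 ** m
    labels = [deep_net((i + 0.5) / n, m) > 0.5 for i in range(n)]
    return sum(a != b for a, b in zip(labels, labels[1:]))


if __name__ == "__main__":
    for m in range(1, 6):
        print(m, count_label_flips(m))
```

In this toy setting each extra layer doubles the number of label changes, so depth $m$ with two units per layer yields $2^m$ decision boundaries. A one-hidden-layer ReLU network on the line with $k$ units is piecewise linear with at most $k + 1$ pieces, so it would need a number of units that grows exponentially in $m$ to reproduce the same labeling.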

Taxonomy

Topics: Neural Networks and Applications · Advanced Neural Network Applications · Machine Learning and ELM

Methods: Dense Connections · Feedforward Network