A Depth Hierarchy for Computing the Maximum in ReLU Networks via Extremal Graph Theory

Itay Safran

arXiv:2601.01417·cs.LG·January 6, 2026

TL;DR

This paper establishes a depth-dependent width lower bound for ReLU networks that exactly compute the maximum of $d$ real inputs, tying the function's inherent complexity to the geometry of its non-differentiable ridges.

Contribution

The paper introduces a combinatorial proof technique based on Turán's theorem from extremal graph theory and uses it to derive the first unconditional super-linear width lower bounds for the maximum function in ReLU networks of depth $k \geq 3$.

Findings

1. Depth hierarchy for maximum computation in ReLU networks.
2. Super-linear width lower bounds at depths ≥3.
3. Graph-theoretic approach links non-differentiable ridges to cliques.
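
For context on what computing the maximum with ReLU units can look like, the sketch below uses the standard identity max(a, b) = b + ReLU(a - b) and reduces the inputs in a pairwise tournament. It is offered only as an illustration of a familiar construction; it is not taken from the paper, and the function names are hypothetical.

```python
def relu(x):
    """ReLU activation: returns x when x is positive, otherwise 0."""
    return x if x > 0 else 0


def pairwise_max(a, b):
    """Exact maximum of two numbers with one ReLU: max(a, b) = b + relu(a - b)."""
    return b + relu(a - b)


def tournament_max(xs):
    """Reduce a non-empty list to its maximum by pairing neighbours each round.

    Returns (maximum, number_of_rounds). Hypothetical helper for illustration.
    """
    if not xs:
        raise ValueError("need at least one input")
    values = list(xs)
    rounds = 0
    while len(values) > 1:
        # Compare disjoint neighbouring pairs.
        nxt = [pairwise_max(values[i], values[i + 1])
               for i in range(0, len(values) - 1, 2)]
        if len(values) % 2 == 1:
            nxt.append(values[-1])  # an unpaired element passes through unchanged
        values = nxt
        rounds += 1
    return values[0], rounds
```

Each round roughly corresponds to one layer of ReLU units, so this style of construction needs about $\log_{2}(d)$ rounds for $d$ inputs. The paper's lower bounds concern how wide a network must be when its depth is much smaller than that.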

Abstract

We consider the problem of exact computation of the maximum function over $d$ real inputs using ReLU neural networks. We prove a depth hierarchy, wherein width $\Omega(d^{1 + \frac{1}{2^{k-2} - 1}})$ is necessary to represent the maximum for any depth $3 \leq k \leq \log_{2}(\log_{2}(d))$. This is the first unconditional super-linear lower bound for this fundamental operator at depths $k \geq 3$, and it holds even if the depth scales with $d$. Our proof technique is based on a combinatorial argument and associates the non-differentiable ridges of the maximum with cliques in a graph induced by the first hidden layer of the computing network, utilizing Turán's theorem from extremal graph theory to show that a sufficiently narrow network cannot capture the non-linearities of the maximum. This suggests that despite its simple nature, the maximum function possesses an inherent complexity that…
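
To make the exponent in the stated bound concrete, the following minimal sketch evaluates $1 + \frac{1}{2^{k-2} - 1}$ exactly for a given depth $k$, along with the largest depth allowed by $k \leq \log_{2}(\log_{2}(d))$. The function names are hypothetical, and because the bound is asymptotic the sketch reports only the exponent, not an actual width.

```python
from fractions import Fraction
import math


def width_exponent(k):
    """Exponent e in the Omega(d**e) width bound, e = 1 + 1 / (2**(k - 2) - 1).

    Defined here only for depth k >= 3, matching the range in the abstract.
    """
    if k < 3:
        raise ValueError("the bound is stated for depths k >= 3")
    return 1 + Fraction(1, 2 ** (k - 2) - 1)


def max_depth(d):
    """Largest integer k with k <= log2(log2(d)).

    Uses floating-point logarithms, so treat results near exact
    thresholds with care; requires d > 2.
    """
    return math.floor(math.log2(math.log2(d)))


if __name__ == "__main__":
    d = 2 ** 16
    for k in range(3, max_depth(d) + 1):
        print(f"depth {k}: width exponent {width_exponent(k)}")
```

The exponent is 2 at depth 3, 4/3 at depth 4, and 8/7 at depth 5, approaching 1 as the depth grows, which is the sense in which the bound forms a depth hierarchy.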

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Complexity and Algorithms in Graphs · Advanced Graph Neural Networks