Deep Networks Learn Deep Hierarchical Models

Amit Daniely

arXiv:2601.00455·cs.LG·January 5, 2026

Open Access

TL;DR

This paper shows that layerwise SGD on residual networks can efficiently learn a class of hierarchical models. The class goes beyond previously analyzed models: some of its members require polynomial depth to express, whereas earlier learnable models can be computed by log-depth circuits.

Contribution

It introduces a class of hierarchical models, built on an unknown label hierarchy, that deep networks are shown to learn and that reaches the depth limit of efficient learnability, extending prior results on what deep learning algorithms can learn.

Findings

01

Layerwise SGD on residual networks efficiently learns a class of hierarchical models.

02

Some models in this class require polynomial depth to express, whereas previously studied models can be computed by log-depth circuits.

03

Learnability of hierarchical models might eventually form a basis for understanding deep learning.

Abstract

We consider supervised learning with $n$ labels and show that layerwise SGD on residual networks can efficiently learn a class of hierarchical models. This model class assumes the existence of an (unknown) label hierarchy $L_{1} \subseteq L_{2} \subseteq \dots \subseteq L_{r} = [n]$, where labels in $L_{1}$ are simple functions of the input, while for $i > 1$, labels in $L_{i}$ are simple functions of simpler labels. Our class surpasses models that were previously shown to be learnable by deep learning algorithms, in the sense that it reaches the depth limit of efficient learnability. That is, there are models in this class that require polynomial depth to express, whereas previous models can be computed by log-depth circuits. Furthermore, we suggest that learnability of such hierarchical models might eventually form a basis for understanding deep learning. Beyond their natural fit for…
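
To make the label hierarchy in the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's construction: the abstract does not specify what counts as a "simple function", so this sketch assumes linear threshold functions of the input for labels in $L_{1}$ and majority votes over a few lower-level labels for labels in $L_{i}$ with $i > 1$. The names `make_hierarchy`, `label_vector`, and `fan_in` are hypothetical and chosen only for illustration.

```python
import random


def sign(v):
    """Return +1 for non-negative values and -1 otherwise."""
    return 1 if v >= 0 else -1


def make_hierarchy(d, n, r, fan_in=3, seed=0):
    """Build a toy hierarchy over n labels, split into r levels.

    Assumption (not from the paper): level-1 labels are linear threshold
    functions of a d-dimensional input, and higher-level labels are
    majority votes over up to `fan_in` labels from lower levels.
    Returns (levels, rules), where levels[i] lists the labels that first
    appear at level i + 1, so L_i is the union of levels[0..i-1].
    """
    rng = random.Random(seed)
    per_level = n // r
    levels, rules = [], {}
    for i in range(r):
        start = i * per_level
        stop = n if i == r - 1 else start + per_level
        new_labels = list(range(start, stop))
        for j in new_labels:
            if i == 0:
                weights = [rng.gauss(0, 1) for _ in range(d)]
                rules[j] = ("input", weights)
            else:
                # Parents are drawn only from strictly lower levels.
                k = min(fan_in, start)
                rules[j] = ("labels", rng.sample(range(start), k))
        levels.append(new_labels)
    return levels, rules


def label_vector(x, rules, n):
    """Compute all n labels (+1 or -1) for a single input x."""
    y = [0] * n
    for j in range(n):  # parents always have smaller indices than j
        kind, params = rules[j]
        if kind == "input":
            y[j] = sign(sum(w * xi for w, xi in zip(params, x)))
        else:
            y[j] = sign(sum(y[p] for p in params))
    return y


if __name__ == "__main__":
    levels, rules = make_hierarchy(d=5, n=12, r=4)
    cumulative = []
    for i, new_labels in enumerate(levels, start=1):
        cumulative = cumulative + new_labels
        print(f"L_{i} = {cumulative}")
    x = [random.gauss(0, 1) for _ in range(5)]
    print(label_vector(x, rules, 12))
```

Running the script prints the nested sets $L_{1} \subseteq \dots \subseteq L_{r}$ and one label vector. In this toy version, a label at level $i$ is a composition of $i$ simple functions, which is the sense in which deeper levels of the hierarchy correspond to deeper computations.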

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning