From complex to simple: hierarchical free-energy landscape renormalized in deep neural networks

Hajime Yoshino

arXiv:1910.09918·cond-mat.dis-nn·April 17, 2020

TL;DR

This paper uses a statistical mechanical approach to analyze the configuration space of deep neural networks, revealing hierarchical free-energy landscapes and phase transitions that explain their efficiency.

Contribution

It introduces a replica-based framework for studying the layer-by-layer phase transitions and the evolution of the free-energy landscape in deep neural networks, highlighting how complexity is reduced hierarchically with depth.

Findings

01

Successive phase transitions occur layer by layer as the number of training data increases.

02

The free-energy landscape becomes simpler in deeper layers due to renormalization.

03

The storage capacity of a deep network grows exponentially with its depth.

Abstract

We develop a statistical mechanical approach based on the replica method to study the design space of deep and wide neural networks constrained to meet a large number of training data. Specifically, we analyze the configuration space of the synaptic weights and neurons in the hidden layers of a simple feed-forward perceptron network in two scenarios: a setting with random inputs/outputs and a teacher-student setting. As the strength of the constraints increases, i.e., as the number of training data increases, successive 2nd-order glass transitions (random inputs/outputs) or 2nd-order crystalline transitions (teacher-student setting) take place layer by layer, starting next to the input/output boundaries and going deeper into the bulk, with the thickness of the solid phase growing logarithmically with the data size. This implies the typical storage capacity of the network grows exponentially fast…
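The step from "thickness grows logarithmically with the data size" to "capacity grows exponentially" can be made concrete with a small numerical sketch. The code below is an illustration under stated assumptions, not the paper's calculation. The names `alpha` (a data-size parameter) and `xi0` (a proportionality constant), the pure-logarithm form of the thickness, and the idea that the network saturates when the solid regions growing from the two boundaries meet in the middle are all hypothetical choices made for this example.

```python
import math


def solid_thickness(alpha: float, xi0: float = 1.0) -> float:
    """Assumed thickness (in layers) of the solid region next to one boundary.

    Hypothetical form: xi0 * ln(alpha) for alpha > 1, zero otherwise.
    """
    if alpha <= 1.0:
        return 0.0
    return xi0 * math.log(alpha)


def critical_alpha(depth: float, xi0: float = 1.0) -> float:
    """Data size at which the two solid regions span the whole depth.

    Assumption: solid regions grow inward from both boundaries, so the
    network is fully solid when 2 * thickness == depth. Inverting the
    logarithm gives an exponential dependence on depth.
    """
    return math.exp(depth / (2.0 * xi0))


if __name__ == "__main__":
    for depth in (4, 8, 16):
        alpha_c = critical_alpha(depth)
        # Check that the inversion is consistent with the assumed thickness law.
        spanned = 2.0 * solid_thickness(alpha_c)
        print(f"depth={depth:2d}  alpha_c={alpha_c:10.2f}  layers spanned={spanned:.2f}")
```

Under these assumptions, doubling the depth squares the critical data size, which is the sense in which a logarithmic thickness law corresponds to exponential growth of capacity with depth.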
