Force balance controls the relaxation time of the gradient descent algorithm in the satisfiable phase

Sungmin Hwang; Harukuni Ikeda

arXiv:1910.07307·cond-mat.dis-nn·May 20, 2020


TL;DR

This paper investigates how force balance influences the relaxation time of gradient descent in the spherical single-layer perceptron near the SAT-UNSAT transition. The first non-zero eigenvalue of the Hessian vanishes faster than the Marchenko-Pastur law predicts, which connects the problem to jamming phenomena.

Contribution

The paper numerically confirms that the relaxation time is governed by the first non-zero eigenvalue of the Hessian, and it links this eigenvalue's behavior to force balance at the SAT-UNSAT transition.

Findings

01

Relaxation time diverges near the SAT-UNSAT transition.

02

The first non-zero eigenvalue of the Hessian controls the relaxation dynamics.

03

Critical exponent matches that of jamming transitions.

Abstract

We numerically study the relaxation dynamics of the single-layer perceptron with the spherical constraint. This is the simplest model of neural networks and serves as a prototypical mean-field model of both convex and non-convex optimization problems. The relaxation time of the gradient descent algorithm rapidly increases near the SAT-UNSAT transition point. We numerically confirm that the first non-zero eigenvalue of the Hessian controls the relaxation time. Upon approaching the SAT-UNSAT transition point, this first eigenvalue vanishes much faster than predicted by the Marchenko-Pastur law of random matrix theory, which is derived under the assumption that the unsatisfied constraints are uncorrelated. This leads to a non-trivial critical exponent of the relaxation time in the SAT phase. Using a simple scaling analysis, we show that the isolation of this first eigenvalue from the bulk of…
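
As a rough illustration of the setup described in the abstract, the following is a minimal sketch of projected gradient descent on a spherical perceptron. The harmonic cost on negative gaps, the margin `sigma`, the step size, the tolerance, and all function names are assumptions made for this example; they are not taken from the paper.

```python
import math
import random


def make_patterns(n, m, rng):
    """Draw m random Gaussian patterns of dimension n."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]


def gaps(x, patterns, sigma):
    """Gap h_mu = xi_mu . x / sqrt(N) - sigma; h_mu < 0 means unsatisfied."""
    root_n = math.sqrt(len(x))
    return [sum(p_i * x_i for p_i, x_i in zip(p, x)) / root_n - sigma
            for p in patterns]


def energy(h):
    """Assumed harmonic cost: only unsatisfied constraints contribute."""
    return 0.5 * sum(v * v for v in h if v < 0.0)


def project_to_sphere(x):
    """Rescale x so that |x|^2 = N (the spherical constraint)."""
    norm = math.sqrt(sum(v * v for v in x))
    scale = math.sqrt(len(x)) / norm
    return [v * scale for v in x]


def relax(n=30, alpha=0.5, sigma=0.0, step=0.1, tol=1e-12,
          max_steps=20000, seed=0):
    """Run projected gradient descent; return (steps taken, final energy)."""
    rng = random.Random(seed)
    patterns = make_patterns(n, int(alpha * n), rng)
    x = project_to_sphere([rng.gauss(0.0, 1.0) for _ in range(n)])
    root_n = math.sqrt(n)
    for t in range(max_steps):
        h = gaps(x, patterns, sigma)
        e = energy(h)
        if e < tol:
            return t, e
        # Gradient of the cost: sum over unsatisfied mu of h_mu * xi_mu / sqrt(N).
        grad = [0.0] * n
        for h_mu, p in zip(h, patterns):
            if h_mu < 0.0:
                c = h_mu / root_n
                for i in range(n):
                    grad[i] += c * p[i]
        # Descent step followed by projection back onto the sphere.
        x = project_to_sphere([x_i - step * g_i for x_i, g_i in zip(x, grad)])
    return max_steps, energy(gaps(x, patterns, sigma))
```

Calling `relax` for increasing `alpha` at fixed `sigma` gives a crude way to watch the step count grow as the satisfiable region is exhausted. System sizes this small are only illustrative and say nothing quantitative about critical exponents.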
