The global optimum of shallow neural network is attained by ridgelet transform
Sho Sonoda, Isao Ishikawa, Masahiro Ikeda, Kei Hagihara, Yoshihiro Sawano, Takuo Matsubara, Noboru Murata

TL;DR
This paper proves that the global minimum of the backpropagation training problem for shallow neural networks is given explicitly by the ridgelet transform, linking neural network training to integral geometry.
Contribution
It introduces a continuous model of neural networks, which reduces training to a convex optimization problem, and shows that the global optimum is given by the ridgelet transform, providing a new analytical perspective.
Findings
The global minimum corresponds to the ridgelet transform of the target function.
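Computational experiments show a similarity between the scatter plot of hidden parameters after training and the spectrum of the ridgelet transform.
An explicit expression for the global optimizer is derived via convex optimization in an infinite-dimensional Hilbert space.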
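To make the "ridgelet spectrum" in the findings concrete, here is a minimal numerical sketch, not the authors' code. It assumes the common one-dimensional form R[f](a, b) = ∫ f(x) ρ(a x − b) dx with a real-valued ρ, approximated by a Riemann sum. The target function, the choice of ρ (a Gaussian derivative), the grid ranges, and the function name `ridgelet_transform_1d` are all hypothetical choices for illustration; the paper's ρ and normalization may differ.

```python
import numpy as np

def ridgelet_transform_1d(f, rho, a_grid, b_grid, x_grid):
    """Riemann-sum approximation of R[f](a, b) = ∫ f(x) rho(a*x - b) dx."""
    dx = x_grid[1] - x_grid[0]          # uniform grid spacing
    fx = f(x_grid)                      # shape (n_x,)
    # Argument a*x - b for every (a, b, x) triple: shape (n_a, n_b, n_x)
    arg = a_grid[:, None, None] * x_grid[None, None, :] - b_grid[None, :, None]
    return (rho(arg) * fx[None, None, :]).sum(axis=-1) * dx

def target(x):
    # Hypothetical target function on [-1, 1]
    return np.sin(2 * np.pi * x)

def rho(z):
    # Hypothetical choice of ρ: first derivative of a Gaussian
    return -z * np.exp(-z**2 / 2)

a_grid = np.linspace(-30, 30, 61)       # hidden weights a
b_grid = np.linspace(-30, 30, 61)       # hidden biases b
x_grid = np.linspace(-1, 1, 401)        # integration grid for x

# spectrum[i, j] approximates R[f](a_grid[i], b_grid[j])
spectrum = ridgelet_transform_1d(target, rho, a_grid, b_grid, x_grid)
```

Plotting `np.abs(spectrum)` over the (a, b) plane gives the kind of picture that could be compared against a scatter plot of trained hidden parameters (a_j, b_j), which is the comparison the experiments describe.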
Abstract
We prove that the global minimum of the backpropagation (BP) training problem of neural networks with an arbitrary nonlinear activation is given by the ridgelet transform. A series of computational experiments show that there exists an interesting similarity between the scatter plot of hidden parameters in a shallow neural network after the BP training and the spectrum of the ridgelet transform. By introducing a continuous model of neural networks, we reduce the training problem to a convex optimization in an infinite dimensional Hilbert space, and obtain the explicit expression of the global optimizer via the ridgelet transform.
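For orientation, the objects named in the abstract can be written out as follows. This is a sketch in the notation commonly used in the ridgelet literature, which is assumed here and may differ from the paper's: γ is a coefficient function over hidden parameters (a, b), σ is the activation, and ρ is an auxiliary function paired with σ. The abstract does not state the paper's exact training objective, its regularizer, or the specific ρ that appears in the optimizer, so none of these are reproduced.

```latex
% Continuous model (integral representation) of a shallow network
S[\gamma](x) = \int_{\mathbb{R}^m \times \mathbb{R}} \gamma(a, b)\, \sigma(a \cdot x - b)\, \mathrm{d}a\, \mathrm{d}b

% Ridgelet transform of a target function f with respect to \rho
R[f](a, b) = \int_{\mathbb{R}^m} f(x)\, \overline{\rho(a \cdot x - b)}\, \mathrm{d}x

% Reconstruction formula, valid when the pair (\sigma, \rho) is admissible;
% ((\sigma, \rho)) denotes a scalar constant depending only on \sigma and \rho
S[R[f]] = ((\sigma, \rho))\, f
```

Because S is linear in γ, a squared-error objective over γ is convex, which is consistent with the abstract's reduction of training to a convex optimization problem in a Hilbert space.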
Taxonomy
Topics: Neural Networks and Applications · Image and Signal Denoising Methods · Face and Expression Recognition
