Upper Bounds for Local Learning Coefficients of Three-Layer Neural Networks
Yuki Kurumadani

TL;DR
This paper derives an upper-bound formula for the local learning coefficient at singular points in three-layer neural networks, applicable to various activation functions, and clarifies its relation to known coefficients in one-dimensional input cases.
Contribution
The paper introduces a new upper-bound formula for the local learning coefficient at singular points in three-layer neural networks, extending applicability to a broader class of activation functions.
Findings
The upper bound matches known learning coefficients when the input dimension is one.
The formula applies to general analytic activation functions, including swish and polynomial activations.
The formula provides a systematic understanding of how network parameters influence the learning coefficient.
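To make the setting concrete, here is a minimal sketch of a three-layer network with a swish activation, showing why such models are singular: when an output weight is zero, the matching hidden-layer weights no longer affect the function, so many parameter values realize the same function. The parameterization and the names `three_layer`, `a`, `B`, and `c` are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np

def swish(z):
    # swish(z) = z * sigmoid(z), an analytic activation function
    return z / (1.0 + np.exp(-z))

def three_layer(x, a, B, c):
    # Hypothetical parameterization (an assumption, not the paper's notation):
    # x: (n, d) inputs, B: (h, d) input-to-hidden weights,
    # c: (h,) hidden biases, a: (h,) hidden-to-output weights.
    return swish(x @ B.T + c) @ a

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 1))      # one-dimensional input, five sample points
a = np.zeros(2)                  # both output weights set to zero
c = np.zeros(2)
B1 = rng.normal(size=(2, 1))     # two different hidden-weight settings
B2 = rng.normal(size=(2, 1))

out1 = three_layer(x, a, B1, c)
out2 = three_layer(x, a, B2, c)

# With a = 0 the hidden weights are not identifiable: both settings
# produce the same (zero) function on these inputs.
same_function = np.allclose(out1, out2)
```

The set of parameters that realize the same function is therefore not a single point, which is the kind of degeneracy that the learning coefficient measures.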
Abstract
Three-layer neural networks are known to form singular learning models, and their Bayesian asymptotic behavior is governed by the learning coefficient, or real log canonical threshold. Although this quantity has been clarified for regular models and for some special singular models, broadly applicable methods for evaluating it in neural networks remain limited. Recently, a formula for the local learning coefficient of semiregular models was proposed, yielding an upper bound on the learning coefficient. However, this formula applies only to nonsingular points in the set of realization parameters and cannot be used at singular points. In particular, for three-layer neural networks, the resulting upper bound has been shown to differ substantially from learning coefficient values already known in some cases. In this paper, we derive an upper-bound formula for the local learning…
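For readers unfamiliar with the term, the role of the learning coefficient in Bayesian asymptotics can be summarized by the standard free-energy expansion from singular learning theory. The symbols below ($F_n$, $L_n$, $w_0$, $\lambda$, $m$, $d$) are not defined in this summary and are introduced here only as conventional notation; this is a background sketch, not a formula from the paper.

```latex
% Asymptotic expansion of the Bayesian free energy for sample size n:
%   L_n(w_0) : empirical loss at an optimal parameter w_0
%   \lambda  : learning coefficient (real log canonical threshold)
%   m        : multiplicity of \lambda
F_n = n L_n(w_0) + \lambda \log n - (m - 1) \log\log n + O_p(1)

% For a regular model with d parameters:
\lambda = \frac{d}{2}, \qquad m = 1
```

In singular models such as neural networks, $\lambda$ can be smaller than $d/2$, which is why upper bounds on it are of interest.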
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and ELM · Stochastic Gradient Optimization Techniques
