Understanding Nonlinear Implicit Bias via Region Counts in Input Space
Jingwei Li, Jing Xu, Zifan Wang, Huishuai Zhang, Jingzhao Zhang

TL;DR
This paper studies implicit bias in neural networks by counting the connected regions of input space that share the same predicted label, and relates this region count to generalization and decision-boundary complexity.
Contribution
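It introduces region count as a new measure of implicit bias in nonlinear models. Because region count is determined by the learned function rather than by the parameters, it is invariant to reparametrization. The paper also connects region count to training hyperparameters and generalization.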
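The invariance claim can be illustrated with a small sketch. This is not the authors' code: the two-layer ReLU network, the weight names `W1` and `W2`, and the scale factor `c` are all assumptions chosen for illustration. Scaling the hidden layer by `c` and the output layer by `1/c` leaves every prediction unchanged while changing the parameter norm, so a quantity computed from predictions alone (such as region count) cannot change either.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(16, 2))  # hypothetical hidden-layer weights
W2 = rng.normal(size=(3, 16))  # hypothetical output-layer weights


def predict(W1, W2, X):
    """Predicted labels of a bias-free two-layer ReLU network."""
    hidden = np.maximum(X @ W1.T, 0.0)
    return np.argmax(hidden @ W2.T, axis=1)


# Reparametrize: ReLU is positively homogeneous, so (c*W1, W2/c)
# represents the same function as (W1, W2). A power of two keeps the
# rescaling exact in floating point.
c = 4.0
W1_scaled, W2_scaled = c * W1, W2 / c

X = rng.normal(size=(500, 2))
same_labels = bool(np.array_equal(predict(W1, W2, X),
                                  predict(W1_scaled, W2_scaled, X)))

# The parameter norm, by contrast, does change under the rescaling.
norm = float(np.sqrt(np.sum(W1**2) + np.sum(W2**2)))
norm_scaled = float(np.sqrt(np.sum(W1_scaled**2) + np.sum(W2_scaled**2)))

print(same_labels, norm, norm_scaled)
```

Here the labels agree on every sampled input while the two norms differ, which is the sense in which a function-level measure is insensitive to how the function is parametrized.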
Findings
Small region counts align with geometrically simple decision boundaries and correlate with good generalization.
Larger learning rates and smaller batch sizes reduce region counts.
Theoretical analysis explains how learning rate influences region count.
Abstract
One explanation for the strong generalization ability of neural networks is implicit bias. Yet the definition and mechanism of implicit bias in non-linear contexts remain poorly understood. In this work, we propose to characterize implicit bias by the count of connected regions in the input space with the same predicted label. Compared with parameter-dependent metrics (e.g., norm or normalized margin), region count can be better adapted to nonlinear, overparameterized models, because it is determined by the function mapping and is invariant to reparametrization. Empirically, we find that small region counts align with geometrically simple decision boundaries and correlate well with good generalization performance. We also observe that good hyper-parameter choices such as larger learning rates and smaller batch sizes can induce small region counts. We further establish the theoretical…
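To make the definition concrete, here is a minimal sketch of estimating region count along a one-dimensional segment of input space. Restricting to a segment and sampling it on a uniform grid are assumptions made for illustration, not necessarily the paper's procedure; `count_regions_on_segment` and `toy_predict` are hypothetical names.

```python
import numpy as np


def count_regions_on_segment(predict, x_start, x_end, num_points=1000):
    """Estimate the number of connected same-label regions on a segment.

    `predict` maps an array of shape (n, d) to integer labels of shape (n,).
    The segment is sampled at `num_points` evenly spaced points, so regions
    narrower than the grid spacing can be missed.
    """
    t = np.linspace(0.0, 1.0, num_points)[:, None]
    points = (1.0 - t) * x_start + t * x_end
    labels = predict(points)
    # Every label change between neighbouring samples starts a new region.
    return 1 + int(np.sum(labels[1:] != labels[:-1]))


def toy_predict(points):
    # Hypothetical classifier: label 1 inside the band 0.3 < x0 < 0.7.
    x0 = points[:, 0]
    return ((x0 > 0.3) & (x0 < 0.7)).astype(int)


x_start = np.array([0.0, 0.0])
x_end = np.array([1.0, 0.0])
n_regions = count_regions_on_segment(toy_predict, x_start, x_end)
print(n_regions)
```

For the toy classifier the segment crosses the band once, giving the label sequence 0, 1, 0 and therefore three regions. A trained network would be plugged in by passing its own prediction function as `predict`.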
Taxonomy
Topics: Neural Networks and Applications · Face and Expression Recognition · Stochastic Gradient Optimization Techniques
Methods: ALIGN
