Degrees of Freedom in Deep Neural Networks
Tianxiang Gao, Vladimir Jojic

TL;DR
This paper investigates the degrees of freedom in deep neural networks, revealing they are significantly smaller than the parameter count and are influenced by architecture and regularization, with implications for model complexity and generalization.
Contribution
It introduces an efficient Monte-Carlo method to estimate degrees of freedom in deep networks and demonstrates their relation to model architecture and regularization effects.
Findings
Degrees of freedom are lower than parameter count in deep networks.
Deeper networks exhibit fewer degrees of freedom, indicating regularization-by-depth.
Degrees of freedom can be orders of magnitude smaller than the parameter count on real datasets.
Abstract
In this paper, we explore degrees of freedom in deep sigmoidal neural networks. We show that the degrees of freedom in these models are related to the expected optimism, which is the expected difference between test error and training error. We provide an efficient Monte-Carlo method to estimate the degrees of freedom for multi-class classification methods. We show that degrees of freedom are lower than the parameter count in a simple XOR network. We extend these results to neural nets trained on synthetic and real data, and investigate the impact of network architecture and different regularization choices. The degrees of freedom in deep networks are dramatically smaller than the number of parameters, on some real datasets by several orders of magnitude. Further, we observe that for a fixed number of parameters, deeper networks have fewer degrees of freedom, exhibiting regularization-by-depth.
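The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea behind Monte-Carlo degrees-of-freedom estimation, not the authors' method for multi-class classification. It assumes squared-error regression, where degrees of freedom can be measured as the sensitivity of the fitted values to small random perturbations of the targets. Ridge regression stands in for a neural network because its exact degrees of freedom, the trace of the hat matrix, are known and can be compared with the estimate. The names `ridge_hat_matrix` and `mc_degrees_of_freedom` and all constants are hypothetical choices for illustration.

```python
import numpy as np

def ridge_hat_matrix(X, lam):
    # Hat matrix H = X (X^T X + lam I)^{-1} X^T, so fitted values are H @ y.
    p = X.shape[1]
    return X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)

def mc_degrees_of_freedom(fit_predict, y, n_draws=200, eps=1e-3, seed=0):
    # Perturbation-based estimate (an assumption, not the paper's estimator):
    # df ~= E[ delta^T (f(y + eps * delta) - f(y)) ] / eps, delta ~ N(0, I).
    rng = np.random.default_rng(seed)
    base = fit_predict(y)
    total = 0.0
    for _ in range(n_draws):
        delta = rng.standard_normal(y.shape[0])
        total += delta @ (fit_predict(y + eps * delta) - base) / eps
    return total / n_draws

# Synthetic data: 200 samples, 50 parameters (illustrative sizes).
rng = np.random.default_rng(1)
n, p = 200, 50
X = rng.standard_normal((n, p))
y = X @ rng.standard_normal(p) + rng.standard_normal(n)

results = {}
for lam in (0.0, 10.0, 1000.0):
    H = ridge_hat_matrix(X, lam)
    exact = np.trace(H)                               # exact df of a linear smoother
    estimate = mc_degrees_of_freedom(lambda v: H @ v, y)
    results[lam] = (exact, estimate)
    print(f"lambda={lam:7.1f}  parameters={p}  exact df={exact:6.2f}  MC df={estimate:6.2f}")
```

With no regularization the exact degrees of freedom equal the parameter count, and they fall below it as the penalty grows, which mirrors in a much simpler setting the abstract's claim that regularization reduces degrees of freedom relative to the number of parameters. For a nonlinear model, `fit_predict` would have to retrain on each perturbed target vector, which is far more expensive than the matrix product used here.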
Taxonomy
Topics: Machine Learning and Algorithms · Gaussian Processes and Bayesian Inference · Model Reduction and Neural Networks
