The activity-weight duality in feed forward neural networks: The geometric determinants of generalization
Yu Feng, Yuhai Tu

TL;DR
This paper uncovers an exact activity-weight duality in feedforward neural networks, linking input variations to weight changes, and reveals how geometric factors like landscape sharpness and weight norms influence generalization.
Contribution
It introduces the activity-weight duality and a geometric framework for understanding generalization in neural networks, connecting regularization and training factors to solution geometry.
Findings
Generalization loss decomposes into contributions from the eigen-directions of the Hessian of the loss
Landscape sharpness and the scale of the weight norm both influence generalization
Regularization schemes affect geometric determinants of generalization
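To make the first finding concrete, the sketch below checks the generic second-order identity behind any eigen-direction decomposition: near a minimum, where the gradient vanishes, the loss change for a small weight displacement dw is approximately 0.5 * dw^T H dw, which splits into one term per Hessian eigen-direction. This is only the textbook quadratic expansion, not the paper's decomposition of the generalization loss. The 2x2 Hessian, its eigenvalues, and all function names are hypothetical values chosen for illustration.

```python
import math

# Illustrative only: a hand-built 2x2 "Hessian" with known eigenpairs.
# All numbers and names here are assumptions, not taken from the paper.

def rotation_basis(theta):
    """Return two orthonormal 2D vectors obtained by rotating the axes by theta."""
    v1 = [math.cos(theta), math.sin(theta)]
    v2 = [-math.sin(theta), math.cos(theta)]
    return [v1, v2]

def hessian_from_eigen(eigvals, eigvecs):
    """Build H = sum_k lambda_k * v_k v_k^T from orthonormal eigenvectors."""
    n = len(eigvecs[0])
    return [[sum(lam * v[i] * v[j] for lam, v in zip(eigvals, eigvecs))
             for j in range(n)] for i in range(n)]

def quadratic_loss_change(H, dw):
    """Second-order loss change 0.5 * dw^T H dw at a point with zero gradient."""
    H_dw = [sum(h * d for h, d in zip(row, dw)) for row in H]
    return 0.5 * sum(d * x for d, x in zip(dw, H_dw))

def per_direction_terms(eigvals, eigvecs, dw):
    """Contribution of each eigen-direction: 0.5 * lambda_k * (v_k . dw)^2."""
    return [0.5 * lam * sum(v_i * d for v_i, d in zip(v, dw)) ** 2
            for lam, v in zip(eigvals, eigvecs)]

eigvals = [4.0, 0.25]            # one sharp and one flat direction (hypothetical)
eigvecs = rotation_basis(0.6)
H = hessian_from_eigen(eigvals, eigvecs)

dw = [0.3, -0.1]                 # small displacement in weight space (hypothetical)
total = quadratic_loss_change(H, dw)
terms = per_direction_terms(eigvals, eigvecs, dw)
```

Here `sum(terms)` equals `total` up to floating-point error, so each eigen-direction contributes its curvature (sharpness) times the squared projection of the displacement onto that direction.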
Abstract
One of the fundamental problems in machine learning is generalization. In neural network models with a large number of weights (parameters), many solutions can be found to fit the training data equally well. The key question is which solution can describe testing data not in the training set. Here, we report the discovery of an exact duality (equivalence) between changes in activities in a given layer of neurons and changes in weights that connect to the next layer of neurons in a densely connected layer in any feed forward neural network. The activity-weight (A-W) duality allows us to map variations in inputs (data) to variations of the corresponding dual weights. By using this mapping, we show that the generalization loss can be decomposed into a sum of contributions from different eigen-directions of the Hessian matrix of the loss function at the solution in weight space. The…
Taxonomy
Topics: Neural Networks and Applications · Machine Learning in Materials Science · Machine Learning and ELM
