The Landscape of Multi-Layer Linear Neural Network From the Perspective of Algebraic Geometry
Xiuyi Yang

TL;DR
This paper applies algebraic geometry to analyze the non-convex landscape of linear residual neural networks, revealing geometric structures and properties of critical points to better understand their optimization behavior.
Contribution
It introduces an algebraic geometry framework to decompose the landscape of linear networks into irreducible geometric objects, providing new insights into their critical points and residual connections.
Findings
Decomposition of the landscape into irreducible geometric objects
Proposed hypotheses on loss calculation and critical point properties
Numerical verification supports the geometric analysis
Abstract
A clear understanding of the non-convex landscape of neural networks remains a complex, unresolved problem. This paper studies the landscape of the linear (residual) network, a simplified version of the nonlinear network. Treating the gradient equations as polynomial equations, we use algebraic geometry tools to solve them over the complex number field; the resulting solution set can be decomposed into irreducible complex geometric objects. Three hypotheses are then proposed, concerning how to calculate the loss on each irreducible geometric object, the range within which the losses of critical points lie, and the relationship between the dimension of each irreducible geometric object and the strict saddle condition. Finally, numerical algebraic geometry is applied to verify the rationality of these three hypotheses, which further clarify the landscape of the linear network and the role of residual…
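To make the idea of "gradient equations as polynomial equations" concrete, here is a minimal sketch on a toy problem that is an assumption for illustration and not taken from the paper: a scalar two-layer linear network w2 * w1 * x fitted to the single sample (x, y) = (1, 1). The function names (loss, grad, hessian, det2) are hypothetical, and the derivatives were worked out by hand rather than by any algebraic geometry software. Both gradient polynomials share the factor (w1*w2 - 1), so the critical set splits into two pieces: the curve w1*w2 = 1 (dimension 1) and the isolated origin (dimension 0). The code checks real points on each piece using exact rational arithmetic.

```python
from fractions import Fraction

# Hypothetical toy problem (not from the paper): a scalar two-layer linear
# network w2 * w1 * x fitted to the single sample (x, y) = (1, 1).
def loss(w1, w2):
    return Fraction(1, 2) * (w1 * w2 - 1) ** 2

def grad(w1, w2):
    # Both partial derivatives are polynomials sharing the factor (w1*w2 - 1).
    r = w1 * w2 - 1
    return (r * w2, r * w1)

def hessian(w1, w2):
    # Second derivatives of the loss, derived by hand for this toy case.
    off_diag = 2 * w1 * w2 - 1
    return ((w2 * w2, off_diag), (off_diag, w1 * w1))

def det2(m):
    # Determinant of a 2x2 matrix given as nested tuples.
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Component A: the curve w1 * w2 = 1 (dimension 1), sampled at a few points.
curve_samples = [(Fraction(t), 1 / Fraction(t)) for t in (-3, 1, 2, 5)]
# Component B: the isolated point at the origin (dimension 0).
origin = (Fraction(0), Fraction(0))

for p in curve_samples:
    print("curve point", p, "grad", grad(*p), "loss", loss(*p),
          "Hessian det", det2(hessian(*p)))

print("origin grad", grad(*origin), "loss", loss(*origin),
      "Hessian det", det2(hessian(*origin)))
```

In this toy case the loss is constant on each piece (0 on the curve, 1/2 at the origin). The Hessian determinant is negative at the origin, so the Hessian there has a negative eigenvalue and the point is a strict saddle. On the curve the determinant vanishes, which reflects the flat direction along the curve. This only illustrates the kind of structure the abstract describes; it does not reproduce the paper's hypotheses or its numerical algebraic geometry experiments.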
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and ELM · Model Reduction and Neural Networks
