Complex Critical Points of Deep Linear Neural Networks
Ayush Bharadwaj, Serkan Hoşten

TL;DR
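This paper studies the complex critical points of the loss function of deep linear neural networks (identity activation). It gives an improved bound on their number for single-hidden-layer networks trained on a single data point, classifies the patterns of zero coordinates that occur, and reports computational experiments on small networks.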
Contribution
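It extends the work of Mehta, Chen, Tang, and Hauenstein with an improved bound on the number of complex critical points for networks with a single hidden layer trained on a single data point, a complete classification of zero-coordinate patterns for networks with one hidden layer, and computational experiments run with HomotopyContinuation.jl.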
Findings
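For any number of hidden layers, complex critical points with zero coordinates arise in specific patterns.
These patterns are completely classified for networks with one hidden layer.
Computational experiments on small deep linear networks with varying architectures are reported.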
Abstract
We extend the work of Mehta, Chen, Tang, and Hauenstein on computing the complex critical points of the loss function of deep linear neural networks when the activation function is the identity function. For networks with a single hidden layer trained on a single data point, we give an improved bound on the number of complex critical points of the loss function. We show that, for any number of hidden layers, complex critical points with zero coordinates arise in certain patterns, which we completely classify for networks with one hidden layer. We report the results of computational experiments with varying network architectures defining small deep linear networks using HomotopyContinuation.jl.
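To make the notion of a complex critical point concrete, here is a minimal sketch for the smallest possible case: one input, one hidden unit, one output, and a single data point. The architecture, the data values x = y = 1, the regularization weight lam, and the regularized squared loss itself are all assumptions chosen for illustration and are not taken from the paper. The regularization term is included only so that the critical points of this toy loss are isolated. The loss is treated as a polynomial in complex weights (no complex conjugation), and the closed-form solutions were derived by hand for this toy case.

```python
import cmath

def gradient(a, b, x=1.0, y=1.0, lam=0.25):
    """Gradient of the toy loss L(a, b) = (b*a*x - y)**2 / 2 + lam*(a**2 + b**2) / 2.

    a and b are the two weights of a 1-1-1 linear network (illustrative names).
    They may be complex; the loss is a polynomial, so no conjugates appear.
    """
    r = b * a * x - y                      # residual of the network output
    return (r * b * x + lam * a,           # dL/da
            r * a * x + lam * b)           # dL/db

def critical_points(lam=0.25):
    """Hand-derived critical points for x = y = 1 (assumption: this toy case only).

    Setting both partial derivatives to zero forces a**2 == b**2, which gives
    the origin plus two branches: b = a and b = -a.
    """
    s = cmath.sqrt(1 - lam)                # branch b = a:  a**2 = 1 - lam
    t = cmath.sqrt(-1 - lam)               # branch b = -a: a**2 = -1 - lam
    return [(0, 0), (s, s), (-s, -s), (t, -t), (-t, t)]

points = critical_points()

# Check numerically that the gradient vanishes at every listed point.
for a, b in points:
    ga, gb = gradient(a, b)
    assert abs(ga) < 1e-12 and abs(gb) < 1e-12

# Classify the points: non-real ones, and ones with a zero coordinate.
non_real = [p for p in points if any(abs(complex(c).imag) > 1e-12 for c in p)]
with_zero = [p for p in points if any(abs(c) < 1e-12 for c in p)]
```

In this toy case two of the five critical points are not real, so a search restricted to real weights would miss them, and exactly one critical point (the origin) has zero coordinates. For larger architectures such closed forms are not available, which is where numerical solvers for polynomial systems come in.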
Taxonomy
Topics: Topological and Geometric Data Analysis · Tensor decomposition and applications · Stochastic Gradient Optimization Techniques
