Complex Critical Points of Deep Linear Neural Networks

Ayush Bharadwaj; Serkan Hoşten

arXiv:2301.12651 · math.AG · January 31, 2023

Open Access

TL;DR

This paper analyzes the complex critical points of deep linear neural networks' loss functions, providing bounds, classifications, and computational experiments to better understand their structure and behavior.

Contribution

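It extends the work of Mehta, Chen, Tang, and Hauenstein with an improved bound on the number of complex critical points for single-hidden-layer networks trained on a single data point, a complete classification of the zero-coordinate patterns for networks with one hidden layer, and computational experiments on small deep linear networks.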

Findings

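01. Complex critical points with zero coordinates arise in specific patterns, for any number of hidden layers.

02. These patterns are completely classified for networks with one hidden layer.

03. Computational experiments with HomotopyContinuation.jl are reported for small deep linear networks of varying architecture.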

Abstract

We extend the work of Mehta, Chen, Tang, and Hauenstein on computing the complex critical points of the loss function of deep linear neural networks when the activation function is the identity function. For networks with a single hidden layer trained on a single data point, we give an improved bound on the number of complex critical points of the loss function. We show that, for any number of hidden layers, complex critical points with zero coordinates arise in certain patterns, which we completely classify for networks with one hidden layer. We report the results of computational experiments with varying network architectures defining small deep linear networks using HomotopyContinuation.jl.
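To make "critical point of the loss function" concrete, here is a minimal sketch in plain Python. It assumes an unregularized squared-error loss on a single data point, which may differ from the exact loss used in the paper, and the helper names (`matmul`, `transpose`, `loss_and_grads`) are hypothetical. It computes the loss of a linear network W_n ... W_1 x together with its gradients, and checks that setting every weight to zero gives a critical point of this assumed loss, a simple instance of a critical point with zero coordinates.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def loss_and_grads(weights, x, y):
    """Return (loss, grads) for the loss 0.5 * ||W_n ... W_1 x - y||^2.

    weights is [W_1, ..., W_n]; x and y are column vectors, written as
    lists of one-element rows. The loss has no regularization term
    (an assumption of this sketch).
    """
    # Forward pass with identity activation: keep the input to every layer.
    acts = [x]
    for w in weights:
        acts.append(matmul(w, acts[-1]))
    residual = [[p[0] - t[0]] for p, t in zip(acts[-1], y)]
    loss = 0.5 * sum(r[0] ** 2 for r in residual)

    # Backward pass: delta is the derivative of the loss with respect to
    # the output of the current layer.
    grads = [None] * len(weights)
    delta = residual
    for i in reversed(range(len(weights))):
        grads[i] = matmul(delta, transpose(acts[i]))
        delta = matmul(transpose(weights[i]), delta)
    return loss, grads


# One hidden layer (two weight matrices) and a single data point (x, y).
x = [[1.0], [2.0]]
y = [[3.0], [-1.0]]
zero = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

loss_zero, grads_zero = loss_and_grads([zero, zero], x, y)
loss_id, grads_id = loss_and_grads([identity, identity], x, y)

# Every partial derivative vanishes at the all-zero weights.
zero_is_critical = all(v == 0 for g in grads_zero for row in g for v in row)
# The identity weights are not a critical point for this data.
identity_is_critical = all(v == 0 for g in grads_id for row in g for v in row)
```

The critical points studied in the paper are the complex solutions of the polynomial system obtained by setting all such partial derivatives to zero. This sketch only evaluates the gradients at two real points and does not attempt to solve that system.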

Taxonomy

Topics: Topological and Geometric Data Analysis · Tensor decomposition and applications · Stochastic Gradient Optimization Techniques