Error Bound Analysis for the Regularized Loss of Deep Linear Neural Networks

Po Chen; Rujun Jiang; Peng Wang

arXiv:2502.11152 · math.OC · September 24, 2025

TL;DR

This paper analyzes the local geometric landscape of the regularized loss in deep linear networks, providing error bounds and convergence guarantees for gradient descent.

Contribution

The paper gives a closed-form characterization of the critical point set and establishes an error bound that yields linear convergence of first-order methods.

Findings

01  Gradient descent converges linearly to a critical point.

02  The error bound controls the distance from a point to the critical point set in terms of the gradient norm at that point.

03  Numerical experiments support the theoretical results.
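
The second finding can be written schematically as below. This is only a sketch of the general shape such a bound takes, not the paper's exact statement: the symbols F (the regularized loss), \mathcal{W} (the critical point set), and the constants \kappa, \delta > 0 are placeholder notation, and the linear dependence on the gradient norm is an assumption made here for illustration.

```latex
% Schematic local error bound (placeholder notation, see the note above)
\[
  \operatorname{dist}(W, \mathcal{W}) \;\le\; \kappa \,\bigl\| \nabla F(W) \bigr\|_F
  \qquad \text{for all } W \text{ with } \operatorname{dist}(W, \mathcal{W}) \le \delta .
\]
```

A bound of this kind says that a small gradient forces the iterate to be close to the critical point set, which is the ingredient typically combined with a sufficient-decrease argument to obtain linear convergence of first-order methods.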

Abstract

The optimization foundations of deep linear networks have recently received significant attention. However, due to their inherent non-convexity and hierarchical structure, analyzing the loss functions of deep linear networks remains a challenging task. In this work, we study the local geometric landscape of the regularized squared loss of deep linear networks around each critical point. Specifically, we derive a closed-form characterization of the critical point set and establish an error bound for the regularized loss under mild conditions on network width and regularization parameters. Notably, this error bound quantifies the distance from a point to the critical point set in terms of the current gradient norm, which can be used to derive linear convergence of first-order methods. To support our theoretical findings, we conduct numerical experiments and demonstrate that gradient…
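
The setting in the abstract can be tried out with a minimal NumPy sketch. It assumes the regularized squared loss has the form F(W_1, ..., W_L) = ||W_L ... W_1 - Y||_F^2 + lam * sum_l ||W_l||_F^2 with a single regularization weight `lam`; the paper may use a different scaling or per-layer parameters. The function names, layer widths, step size, and random data are all illustrative choices, not taken from the paper.

```python
import numpy as np

def loss_and_grads(Ws, Y, lam):
    """Assumed loss: ||W_L ... W_1 - Y||_F^2 + lam * sum_l ||W_l||_F^2.

    Ws[0] is the first layer W_1 and Ws[-1] is the last layer W_L.
    Returns the loss value and a list of per-layer gradients.
    """
    n_layers = len(Ws)
    # prefix[i] = Ws[i-1] @ ... @ Ws[0], with prefix[0] = identity
    prefix = [np.eye(Ws[0].shape[1])]
    for W in Ws:
        prefix.append(W @ prefix[-1])
    R = prefix[-1] - Y  # residual of the end-to-end product
    loss = np.sum(R ** 2) + lam * sum(np.sum(W ** 2) for W in Ws)

    grads = [None] * n_layers
    # suffix = product of all layers above the current one (identity at the top)
    suffix = np.eye(Ws[-1].shape[0])
    for l in range(n_layers - 1, -1, -1):
        grads[l] = 2.0 * suffix.T @ R @ prefix[l].T + 2.0 * lam * Ws[l]
        suffix = suffix @ Ws[l]
    return loss, grads

def gradient_descent(Ws, Y, lam, step, iters):
    """Plain gradient descent; records (loss, gradient norm) per iteration."""
    history = []
    for _ in range(iters):
        loss, grads = loss_and_grads(Ws, Y, lam)
        gnorm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
        history.append((loss, gnorm))
        Ws = [W - step * g for W, g in zip(Ws, grads)]
    return Ws, history

# Illustrative problem: a 3-layer linear network with hypothetical widths.
rng = np.random.default_rng(0)
dims = [4, 5, 5, 3]  # input dim, two hidden widths, output dim
Y = rng.standard_normal((dims[-1], dims[0]))
Ws0 = [0.3 * rng.standard_normal((dims[i + 1], dims[i])) for i in range(3)]

Ws, history = gradient_descent(Ws0, Y, lam=1e-2, step=5e-3, iters=2000)
print("first (loss, grad norm):", history[0])
print("last  (loss, grad norm):", history[-1])
```

Plotting the recorded gradient norms on a logarithmic axis is one way to check empirically whether the decay looks linear (a straight line on that scale) for a given choice of widths and regularization.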

Taxonomy

Topics: Neural Networks and Applications

Methods: Softmax · Attention Is All You Need · Sparse Evolutionary Training