Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes
Nikita Kiselev, Andrey Grabovoy

TL;DR
This paper investigates how increasing the sample size affects the neural network loss landscape, providing theoretical bounds and empirical evidence of convergence that help explain neural network training dynamics.
Contribution
The paper offers the first theoretical analysis of how the loss surface changes with sample size and validates the findings empirically on multiple datasets.
Findings
Loss landscape converges as sample size increases
Theoretical upper bounds for loss differences are established
Empirical results confirm convergence in image classification tasks
Abstract
The loss landscape of neural networks is a critical aspect of their training, and understanding its properties is essential for improving their performance. In this paper, we investigate how the loss surface changes when the sample size increases, a previously unexplored issue. We theoretically analyze the convergence of the loss landscape in a fully connected neural network and derive upper bounds for the difference in loss function values when adding a new object to the sample. Our empirical study confirms these results on various datasets, demonstrating the convergence of the loss function surface for image classification tasks. Our findings provide insights into the local geometry of neural loss landscapes and have implications for the development of sample size determination techniques.
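As a rough illustration of the quantity the abstract describes (the change in loss value when one object is added to the sample), the sketch below tracks the mean loss over the first k samples at a fixed parameter vector. It relies only on the elementary identity L_{k+1}(w) - L_k(w) = (l_{k+1}(w) - L_k(w)) / (k + 1), which holds for any averaged loss. It is not the paper's Hessian-based bound. The linear model, logistic loss, synthetic data, and all function names are assumptions made for this example.

```python
import math
import random


def per_sample_loss(w, x, y):
    """Logistic loss of a linear model on one labelled point (y in {-1, +1}).

    Hypothetical stand-in for the per-object loss l_i(w); the paper studies
    fully connected networks, not this toy model.
    """
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    return math.log1p(math.exp(-margin))


def loss_differences(w, samples):
    """Return per-sample losses and |L_{k+1}(w) - L_k(w)| for k = 1..n-1.

    L_k(w) is the mean loss over the first k samples, updated incrementally
    via L_{k+1} = L_k + (l_{k+1} - L_k) / (k + 1).
    """
    losses = [per_sample_loss(w, x, y) for x, y in samples]
    diffs = []
    running_mean = losses[0]  # L_1(w)
    for k in range(1, len(losses)):
        new_mean = running_mean + (losses[k] - running_mean) / (k + 1)
        diffs.append(abs(new_mean - running_mean))
        running_mean = new_mean
    return losses, diffs


# Synthetic data: 1000 two-dimensional points with random labels (assumption).
rng = random.Random(0)
w = [0.5, -0.3]
samples = [
    ([rng.uniform(-1, 1), rng.uniform(-1, 1)], rng.choice([-1, 1]))
    for _ in range(1000)
]

losses, diffs = loss_differences(w, samples)
max_loss = max(losses)

# Print the difference at a few sample sizes next to the crude bound M / (k + 1),
# where M is the largest per-sample loss observed.
for k in (1, 10, 100, 999):
    print(k, diffs[k - 1], max_loss / (k + 1))
```

Because every per-sample loss here lies in [0, M], the difference at step k cannot exceed M / (k + 1), so it shrinks at least as fast as 1/k at any fixed point of the parameter space. The bounds in the paper concern the local geometry of the landscape and are derived differently.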
Taxonomy
Topics: Probability and Risk Models · Innovation Policy and R&D · Firm Innovation and Growth
