Numerical Exploration of Training Loss Level-Sets in Deep Neural Networks

Naveed Tahir; Garrett E. Katz

arXiv:2011.04189·cs.LG·April 27, 2021

TL;DR

This paper introduces a computational approach to explore the structure of training loss level-sets in deep neural networks, revealing insights into the loss landscape and generalization, and proposing a new strategy for reducing test loss.

Contribution

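The paper presents a method for empirically characterizing training loss level-sets in deep neural networks, and compares traversing a level-set to find well-regularized points against the more typical approach of minimizing a weighted sum of training loss and regularization terms.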

Findings

01 Loss level-sets form complex structures in parameter space.

02 Certain points within level-sets exhibit better generalization.

03 The method offers a new visualization of the loss landscape.

Abstract

We present a computational method for empirically characterizing the training loss level-sets of deep neural networks. Our method numerically constructs a path in parameter space that is constrained to a set with a fixed near-zero training loss. By measuring regularization functions and test loss at different points along this path, we examine how different points in parameter space with the same fixed training loss compare in terms of generalization ability. We also compare this method for finding regularized points with the more typical method, which uses objective functions that are weighted sums of training loss and regularization terms. We apply dimensionality reduction to the traversed paths in order to visualize the loss level-sets in a well-regularized region of parameter space. Our results provide new information about the loss landscape of deep neural networks, as well as…
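The abstract does not spell out how the constrained path is built, so the following is only a minimal sketch of one plausible way to walk along a level-set, not the authors' algorithm. It assumes a toy over-parameterized linear model in place of a deep network, the squared L2 norm as the regularization function, and a "tangent step plus correction" scheme: move along the negative regularizer gradient with its component along the loss gradient removed, then pull the point back onto the level-set with a damped Newton-style update. All names (`project_to_level`, `traverse_level_set`, the level `c`, the step size `lr`) are hypothetical.

```python
import numpy as np

def loss(w, X, y):
    """Halved mean squared error of the linear model X @ w."""
    r = X @ w - y
    return 0.5 * np.mean(r ** 2)

def loss_grad(w, X, y):
    """Gradient of `loss` with respect to w."""
    return X.T @ (X @ w - y) / len(y)

def project_to_level(w, X, y, c, iters=100, tol=1e-10):
    """Pull w back onto {loss == c} with damped Newton-style steps along the loss gradient."""
    for _ in range(iters):
        gap = loss(w, X, y) - c
        if abs(gap) < tol:
            break
        g = loss_grad(w, X, y)
        step = gap * g / (g @ g)  # first-order step that would cancel the gap
        alpha = 1.0
        # Halve the step until the distance to the target level actually shrinks.
        while abs(loss(w - alpha * step, X, y) - c) >= abs(gap) and alpha > 1e-8:
            alpha *= 0.5
        w = w - alpha * step
    return w

def traverse_level_set(w0, X, y, c, steps=200, lr=0.05):
    """Follow the level-set {loss == c} while reducing R(w) = 0.5 * ||w||^2 (an assumed regularizer)."""
    w = project_to_level(w0, X, y, c)
    path = [w]
    for _ in range(steps):
        g = loss_grad(w, X, y)
        d = -w                               # negative gradient of the regularizer
        d_tan = d - (d @ g) / (g @ g) * g    # remove the component that would change the loss
        w = project_to_level(w + lr * d_tan, X, y, c)
        path.append(w)
    return np.array(path)

# Toy data: 5 training points, 50 parameters, so many weight vectors share each loss value.
rng = np.random.default_rng(0)
n_train, n_test, dim = 5, 100, 50
w_true = rng.normal(size=dim) / np.sqrt(dim)
X_train = rng.normal(size=(n_train, dim))
y_train = X_train @ w_true
X_test = rng.normal(size=(n_test, dim))
y_test = X_test @ w_true

c = 0.01  # fixed near-zero training loss level (assumed value)
path = traverse_level_set(rng.normal(size=dim), X_train, y_train, c)

# Measure training loss, regularizer, and test loss at every point on the path.
train_losses = np.array([loss(w, X_train, y_train) for w in path])
reg_values = 0.5 * np.sum(path ** 2, axis=1)
test_losses = np.array([loss(w, X_test, y_test) for w in path])
```

Plotting `reg_values` and `test_losses` against the step index gives the kind of comparison the abstract describes: points with the same training loss but different regularization and test loss. For a real network, `loss_grad` would come from automatic differentiation, and the weighted-sum baseline would instead minimize training loss plus a coefficient times the regularizer.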
