Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian

Jack Parker-Holder; Luke Metz; Cinjon Resnick; Hengyuan Hu; Adam Lerer; Alistair Letcher; Alex Peysakhovich; Aldo Pacchiano; Jakob Foerster

arXiv:2011.06505 · cs.LG · November 13, 2020

TL;DR

Ridge Rider (RR) is a novel method that explores the loss surface of neural networks by following Hessian eigenvectors, enabling the discovery of diverse solutions beyond those found by standard gradient descent.

Contribution

The paper introduces Ridge Rider, a new approach that follows Hessian eigenvectors to find qualitatively different solutions in neural network training.
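The idea of following a Hessian eigenvector instead of the gradient can be illustrated on a toy problem. The sketch below is not the authors' implementation: the 2D loss, the step sizes, the stopping rule (stop riding once the followed eigenvalue becomes non-negative, then switch to plain gradient descent), and the helper names `ride_ridge` and `descend` are all assumptions chosen for illustration. It starts exactly at a saddle point, where the gradient is zero, and follows the negative-curvature eigenvector in both directions.

```python
import numpy as np

# Hypothetical toy loss with a saddle at (0, 0) and minima at (+1, 0) and (-1, 0).
def loss(theta):
    x, y = theta
    return 0.25 * x**4 - 0.5 * x**2 + 0.5 * y**2

def grad(theta):
    x, y = theta
    return np.array([x**3 - x, y])

def hessian(theta):
    x, _ = theta
    return np.array([[3.0 * x**2 - 1.0, 0.0], [0.0, 1.0]])

def ride_ridge(theta0, direction, step=0.05, max_steps=200):
    """Follow the Hessian eigenvector closest to `direction` while its eigenvalue is negative."""
    theta = np.array(theta0, dtype=float)
    e = np.array(direction, dtype=float)
    e /= np.linalg.norm(e)
    for _ in range(max_steps):
        eigvals, eigvecs = np.linalg.eigh(hessian(theta))
        # Re-identify the ridge: pick the eigenvector most aligned with the current direction.
        overlaps = eigvecs.T @ e
        k = int(np.argmax(np.abs(overlaps)))
        if eigvals[k] >= 0.0:
            break  # curvature along the ridge is no longer negative (assumed stopping rule)
        e = np.sign(overlaps[k]) * eigvecs[:, k]  # keep a consistent orientation
        theta = theta + step * e
    return theta

def descend(theta, lr=0.1, steps=500):
    """Plain gradient descent, used here only to finish off after riding."""
    for _ in range(steps):
        theta = theta - lr * grad(theta)
    return theta

saddle = np.zeros(2)
eigvals, eigvecs = np.linalg.eigh(hessian(saddle))
ridge = eigvecs[:, 0]  # eigh returns eigenvalues in ascending order, so this is the most negative one

# Ride the ridge in both directions, then descend to the nearest minimum.
solutions = [descend(ride_ridge(saddle, s * ridge)) for s in (+1.0, -1.0)]
stuck = descend(saddle)  # gradient descent from the exact saddle does not move
```

In this toy setting the two ride directions end in two distinct minima, while gradient descent started at the saddle stays put because the gradient there is zero. For a real network the Hessian is far too large to form explicitly, so eigenvectors would have to be approximated, for example with Hessian-vector products; that part is outside this sketch.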

Findings

01. RR can find diverse solutions by following Hessian eigenvectors.

02. Theoretical analysis shows RR effectively spans the loss surface.

03. Experimental results demonstrate RR's ability to discover solutions SGD may miss.

Abstract

Over the last decade, a single algorithm has changed many facets of our lives - Stochastic Gradient Descent (SGD). In the era of ever decreasing loss functions, SGD and its various offspring have become the go-to optimization tool in machine learning and are a key component of the success of deep neural networks (DNNs). While SGD is guaranteed to converge to a local optimum (under loose assumptions), in some cases it may matter which local optimum is found, and this is often context-dependent. Examples frequently arise in machine learning, from shape-versus-texture-features to ensemble methods and zero-shot coordination. In these settings, there are desired solutions which SGD on 'standard' loss functions will not find, since it instead converges to the 'easy' solutions. In this paper, we present a different approach. Rather than following the gradient, which corresponds to a locally…

Videos

Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian (SlidesLive)

Taxonomy

Topics: Computability, Logic, AI Algorithms · Artificial Intelligence in Games · Distributed and Parallel Computing Systems

Methods: Stochastic Gradient Descent