Deep Ensembles: A Loss Landscape Perspective

Stanislav Fort; Huiyi Hu; Balaji Lakshminarayanan

arXiv:1912.02757 · stat.ML · June 26, 2020 · 350 cites

TL;DR

Deep ensembles improve accuracy and robustness because members trained from different random initializations reach diverse modes of the loss landscape, a diversity that Bayesian methods and subspace sampling techniques do not fully capture.

Contribution

This paper investigates the loss landscape of deep ensembles and shows that random initializations explore diverse modes, which helps explain why ensembles outperform Bayesian approaches in practice.

Findings

01

Random initializations explore different modes in the loss landscape.

02

Functions along optimization trajectories cluster within a single mode.

03

Random initializations have unmatched decorrelation power in the diversity-accuracy plane.
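
The findings contrast functions from a single optimization trajectory with functions from independently initialized runs. Below is a minimal sketch of one way to quantify that contrast, assuming diversity is measured as the fraction of examples on which two models predict different labels. That choice is an assumption made here for illustration; this page does not state the paper's exact metric. The label and prediction lists are made-up toy values, and `disagreement` and `accuracy` are hypothetical helper names.

```python
def disagreement(preds_a, preds_b):
    """Fraction of examples on which two models predict different labels."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    differing = sum(a != b for a, b in zip(preds_a, preds_b))
    return differing / len(preds_a)


def accuracy(preds, labels):
    """Fraction of examples on which the predictions match the labels."""
    if len(preds) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)


# Toy data (illustrative only, not results from the paper).
labels = [0, 1, 2, 1, 0, 2, 1, 0]
run_a = [0, 1, 2, 1, 0, 2, 0, 0]       # final model of one training run
run_a_ckpt = [0, 1, 2, 1, 0, 2, 0, 1]  # earlier checkpoint of the same run
run_b = [0, 1, 1, 1, 0, 2, 1, 0]       # run from a different random init

same_trajectory = disagreement(run_a, run_a_ckpt)
different_init = disagreement(run_a, run_b)

print("accuracy of run_a:", accuracy(run_a, labels))
print("accuracy of run_b:", accuracy(run_b, labels))
print("disagreement within one trajectory:", same_trajectory)
print("disagreement across random inits:", different_init)
```

In this toy setup both final models are equally accurate, yet the independently initialized pair disagrees on more examples than the pair taken from one trajectory. That is the kind of point a diversity-accuracy plot would place further along the diversity axis at similar accuracy.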

Abstract

Deep ensembles have been empirically shown to be a promising approach for improving accuracy, uncertainty and out-of-distribution robustness of deep learning models. While deep ensembles were theoretically motivated by the bootstrap, non-bootstrap ensembles trained with just random initialization also perform well in practice, which suggests that there could be other explanations for why deep ensembles work well. Bayesian neural networks, which learn distributions over the parameters of the network, are theoretically well-motivated by Bayesian principles, but do not perform as well as deep ensembles in practice, particularly under dataset shift. One possible explanation for this gap between theory and practice is that popular scalable variational Bayesian methods tend to focus on a single mode, whereas deep ensembles tend to explore diverse modes in function space. We investigate this…

Code & Models

Repositories

ayulockin/LossLandscape

Videos

Deep Ensembles: A Loss Landscape Perspective (Paper Explained) · YouTube

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques

Methods: Deep Ensembles · Average Pooling · 1x1 Convolution · Residual Connection · Batch Normalization · Max Pooling · Global Average Pooling · Bottleneck Residual Block · Residual Block