Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning

Arsenii Ashukha; Alexander Lyzhov; Dmitry Molchanov; Dmitry Vetrov

arXiv:2002.06470·stat.ML·July 20, 2021·79 cites



Open Access

TL;DR

This paper critically examines in-domain uncertainty estimation for deep learning image classification. It highlights pitfalls in current metrics and shows that many advanced ensembling methods perform no better than an ensemble of only a few independently trained networks.

Contribution

The paper introduces the deep ensemble equivalent score (DEE) to evaluate ensembling techniques and shows that many sophisticated methods match the test performance of only a small ensemble of independently trained networks.
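One way to read the DEE idea is as a lookup on a reference curve: given the test score of deep ensembles of size 1, 2, 3, ..., find the smallest ensemble size whose score reaches that of the method under study. The sketch below illustrates that reading only. The function name `deep_ensemble_equivalent`, the choice of a higher-is-better score (for example test log-likelihood), the linear interpolation between integer sizes, and the example numbers are all assumptions made for illustration; they are not taken from this page.

```python
import numpy as np


def deep_ensemble_equivalent(de_scores, method_score):
    """Hypothetical helper: smallest deep-ensemble size matching a method.

    de_scores[i] is the test score (higher is better) of a deep ensemble
    of i + 1 independently trained networks. Returns a possibly fractional
    size, using linear interpolation between neighbouring sizes (an
    assumption of this sketch).
    """
    de_scores = np.asarray(de_scores, dtype=float)

    # A single network already matches or beats the method.
    if method_score <= de_scores[0]:
        return 1.0

    # Indices of ensemble sizes whose score reaches the method's score.
    reached = np.nonzero(de_scores >= method_score)[0]
    if reached.size == 0:
        # The method beats every ensemble size on the reference curve.
        return float("inf")

    hi = int(reached[0])  # first size that reaches the score (hi >= 1 here)
    lo = hi - 1           # last size that falls short of it
    frac = (method_score - de_scores[lo]) / (de_scores[hi] - de_scores[lo])
    return float((lo + 1) + frac)


# Illustrative numbers only, not results from the paper.
reference_curve = [-0.30, -0.25, -0.22, -0.21]
dee = deep_ensemble_equivalent(reference_curve, method_score=-0.235)
print(round(dee, 3))
```

Under these assumptions, a score near 2 to 3 would mean the method is worth roughly two to three independently trained networks, which is the kind of comparison the DEE score is meant to support.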

Findings

01. Existing metrics for in-domain uncertainty have significant pitfalls.

02. Many advanced ensembling techniques are equivalent to small ensembles in test performance.

03. The DEE score provides new insights into ensembling effectiveness.

Abstract

Uncertainty estimation and ensembling methods go hand-in-hand. Uncertainty estimation is one of the main benchmarks for assessment of ensembling performance. At the same time, deep learning ensembles have provided state-of-the-art results in uncertainty estimation. In this work, we focus on in-domain uncertainty for image classification. We explore the standards for its quantification and point out pitfalls of existing metrics. Avoiding these pitfalls, we perform a broad study of different ensembling techniques. To provide more insight into this study, we introduce the deep ensemble equivalent score (DEE) and show that many sophisticated ensembling techniques are equivalent to an ensemble of only a few independently trained networks in terms of test performance.


Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · COVID-19 diagnosis using AI

Methods: Test