Why M Heads are Better than One: Training a Diverse Ensemble of Deep Networks

Stefan Lee, Senthil Purushwalkam, Michael Cogswell, David Crandall, and Dhruv Batra

arXiv:1511.06314 · cs.CV · November 20, 2015 · 203 cites

TL;DR

This paper treats ensembling of deep neural networks as a problem in its own right rather than a post-hoc averaging step. It compares existing ensembling strategies and introduces TreeNets, which share parameters across ensemble members, along with ensemble-aware and diversity-encouraging training losses.

Contribution

It systematically compares ensembling strategies and proposes new methods such as TreeNets and diversity-encouraging losses for end-to-end training of ensembles.

Findings

01. TreeNets improve ensemble performance

02. Diverse ensembles trained end-to-end outperform classical ensembles

03. Ensemble-aware losses increase oracle accuracy
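
The third finding refers to oracle accuracy. This page does not define the term; the sketch below assumes the common reading, in which an example counts as correct if at least one ensemble member classifies it correctly, and contrasts it with the accuracy of the averaged prediction. The function names and the toy probabilities are illustrative assumptions, not taken from the paper.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)


def ensemble_mean_accuracy(member_probs, labels):
    """Accuracy of the prediction made by averaging member probabilities.

    member_probs[m][i] is the class-probability list that member m
    assigns to example i.
    """
    num_members = len(member_probs)
    correct = 0
    for i, label in enumerate(labels):
        num_classes = len(member_probs[0][i])
        avg = [
            sum(member[i][c] for member in member_probs) / num_members
            for c in range(num_classes)
        ]
        correct += int(argmax(avg) == label)
    return correct / len(labels)


def oracle_accuracy(member_probs, labels):
    """Fraction of examples that at least one member gets right (assumed definition)."""
    correct = 0
    for i, label in enumerate(labels):
        correct += int(any(argmax(member[i]) == label for member in member_probs))
    return correct / len(labels)


# Toy data: 2 members, 3 examples, 2 classes (made-up numbers).
member_a = [[0.9, 0.1], [0.9, 0.1], [0.6, 0.4]]
member_b = [[0.2, 0.8], [0.4, 0.6], [0.7, 0.3]]
labels = [0, 1, 1]

mean_acc = ensemble_mean_accuracy([member_a, member_b], labels)
oracle_acc = oracle_accuracy([member_a, member_b], labels)
print(f"averaged accuracy: {mean_acc:.3f}, oracle accuracy: {oracle_acc:.3f}")
```

On this toy data the averaged prediction is right on one example out of three, while some member is right on two out of three. The gap illustrates why oracle accuracy is a natural measure of how diverse the members are.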

Abstract

Convolutional Neural Networks have achieved state-of-the-art performance on a wide range of tasks. Most benchmarks are led by ensembles of these powerful learners, but ensembling is typically treated as a post-hoc procedure implemented by averaging independently trained models with model variation induced by bagging or random initialization. In this paper, we rigorously treat ensembling as a first-class problem to explicitly address the question: what are the best strategies to create an ensemble? We first compare a large number of ensembling strategies, and then propose and evaluate novel strategies, such as parameter sharing (through a new family of models we call TreeNets) as well as training under ensemble-aware and diversity-encouraging losses. We demonstrate that TreeNets can improve ensemble performance and that diverse ensembles can be trained end-to-end under a unified loss,…
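
The parameter-sharing idea in the abstract can be pictured with a toy model in which several prediction heads branch off one shared trunk. The sketch below is a hypothetical illustration only: the class name `TinyTreeNet`, the single dense trunk layer, the layer sizes, and the random untrained weights are all assumptions, and the paper's actual TreeNet architectures are not described on this page.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(rng, n_out, n_in):
    """Random weights and zero biases for an untrained dense layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class TinyTreeNet:
    """Hypothetical toy model: one shared trunk layer feeding M separate heads."""

    def __init__(self, n_in, n_hidden, n_classes, num_heads, seed=0):
        rng = random.Random(seed)
        self.trunk = random_layer(rng, n_hidden, n_in)  # shared by every head
        self.heads = [random_layer(rng, n_classes, n_hidden) for _ in range(num_heads)]

    def forward(self, x):
        """Return one class-probability list per head."""
        shared = relu(linear(*self.trunk, x))  # computed once, reused by all heads
        return [softmax(linear(*head, shared)) for head in self.heads]

    def predict(self, x):
        """Average the heads' probabilities, as in a classical averaged ensemble."""
        per_head = self.forward(x)
        n_classes = len(per_head[0])
        return [sum(p[c] for p in per_head) / len(per_head) for c in range(n_classes)]


net = TinyTreeNet(n_in=4, n_hidden=8, n_classes=3, num_heads=5)
x = [0.5, -1.0, 2.0, 0.1]
per_head_probs = net.forward(x)
avg_probs = net.predict(x)
print(len(per_head_probs), "heads; averaged probabilities:", avg_probs)
```

Compared with M fully independent networks, the trunk's parameters and its forward computation are shared, while each head keeps its own parameters and can still produce a different prediction.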

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Anomaly Detection Techniques and Applications