Why M Heads are Better than One: Training a Diverse Ensemble of Deep Networks
Stefan Lee, Senthil Purushwalkam, Michael Cogswell, David Crandall, and Dhruv Batra

TL;DR
This paper treats ensembling of deep neural networks as a first-class problem: it compares existing ensembling strategies and introduces parameter sharing (TreeNets) and ensemble-aware, diversity-encouraging training losses to build more effective ensembles than post-hoc averaging of independently trained models.
Contribution
It systematically compares ensembling strategies and proposes new methods such as TreeNets and diversity-encouraging losses for end-to-end training of ensembles.
Findings
TreeNets improve ensemble performance
Diverse ensembles trained end-to-end outperform classical ensembles
Ensemble-aware losses increase oracle accuracy
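To make the difference between ordinary ensemble accuracy and oracle accuracy concrete, here is a minimal sketch. It assumes oracle accuracy means the fraction of examples that at least one ensemble member classifies correctly; that reading, the function names, and the toy probabilities are illustrative assumptions, not material from the paper.

```python
from typing import List, Sequence

# probs[m][n][c]: probability that member m assigns to class c for example n.
Probs = List[List[List[float]]]


def argmax(xs: Sequence[float]) -> int:
    """Index of the largest value in xs."""
    return max(range(len(xs)), key=lambda i: xs[i])


def ensemble_mean_accuracy(probs: Probs, labels: List[int]) -> float:
    """Accuracy after averaging member probabilities (classical ensembling)."""
    num_members = len(probs)
    correct = 0
    for n, label in enumerate(labels):
        num_classes = len(probs[0][n])
        avg = [
            sum(probs[m][n][c] for m in range(num_members)) / num_members
            for c in range(num_classes)
        ]
        correct += int(argmax(avg) == label)
    return correct / len(labels)


def oracle_accuracy(probs: Probs, labels: List[int]) -> float:
    """Fraction of examples that at least one member gets right (assumed definition)."""
    num_members = len(probs)
    hits = 0
    for n, label in enumerate(labels):
        hits += int(any(argmax(probs[m][n]) == label for m in range(num_members)))
    return hits / len(labels)


# Toy data (hypothetical): 2 members, 3 examples, 2 classes.
toy_probs = [
    [[0.9, 0.1], [0.9, 0.1], [0.8, 0.2]],  # member 0
    [[0.2, 0.8], [0.3, 0.7], [0.7, 0.3]],  # member 1
]
toy_labels = [0, 1, 1]

mean_acc = ensemble_mean_accuracy(toy_probs, toy_labels)
oracle_acc = oracle_accuracy(toy_probs, toy_labels)
```

On this toy data the members disagree on the second example, so the oracle counts it as solved even though the averaged prediction is wrong. Under this definition oracle accuracy rewards members that make different mistakes, which is why it is a natural measure of ensemble diversity.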
Abstract
Convolutional Neural Networks have achieved state-of-the-art performance on a wide range of tasks. Most benchmarks are led by ensembles of these powerful learners, but ensembling is typically treated as a post-hoc procedure implemented by averaging independently trained models with model variation induced by bagging or random initialization. In this paper, we rigorously treat ensembling as a first-class problem to explicitly address the question: what are the best strategies to create an ensemble? We first compare a large number of ensembling strategies, and then propose and evaluate novel strategies, such as parameter sharing (through a new family of models we call TreeNets) as well as training under ensemble-aware and diversity-encouraging losses. We demonstrate that TreeNets can improve ensemble performance and that diverse ensembles can be trained end-to-end under a unified loss,…
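The parameter-sharing idea can be sketched as one shared trunk feeding several independent heads. The example below is a toy illustration in plain Python, not the authors' architecture: the class `TinyTreeNet`, the single shared linear layer, and both loss functions are hypothetical stand-ins. The first loss scores the averaged prediction, as one possible ensemble-aware objective; the second scores only the best head, as one possible diversity-encouraging objective.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def linear(w: Matrix, x: Vector) -> Vector:
    """Matrix-vector product with no bias term."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def softmax(z: Vector) -> Vector:
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class TinyTreeNet:
    """Hypothetical toy model: one shared trunk layer, then one head per member."""

    def __init__(self, in_dim: int, hidden_dim: int, num_classes: int,
                 num_heads: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.trunk = random_matrix(hidden_dim, in_dim, rng)  # shared parameters
        self.heads = [random_matrix(num_classes, hidden_dim, rng)
                      for _ in range(num_heads)]             # per-member parameters

    def forward(self, x: Vector) -> List[Vector]:
        h = relu(linear(self.trunk, x))  # computed once, reused by every head
        return [softmax(linear(w, h)) for w in self.heads]


def averaged_prediction_loss(head_probs: List[Vector], label: int) -> float:
    """Cross-entropy of the averaged probabilities (assumed ensemble-aware loss)."""
    avg = sum(p[label] for p in head_probs) / len(head_probs)
    return -math.log(avg)


def best_head_loss(head_probs: List[Vector], label: int) -> float:
    """Cross-entropy of the best head only (assumed diversity-encouraging loss)."""
    return min(-math.log(p[label]) for p in head_probs)


net = TinyTreeNet(in_dim=4, hidden_dim=8, num_classes=3, num_heads=5)
outputs = net.forward([0.5, -1.0, 0.25, 2.0])
```

Because the trunk activations are computed once per input, adding a head costs only that head's parameters and compute. The best-head loss is never larger than the averaged-prediction loss for the same outputs, since the average probability of the true class cannot exceed the maximum over heads.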
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Anomaly Detection Techniques and Applications
