No One Representation to Rule Them All: Overlapping Features of Training Methods

Raphael Gontijo-Lopes; Yann Dauphin; Ekin D. Cubuk

arXiv:2110.12899 · cs.LG · April 27, 2022 · 5 cites

TL;DR

This paper empirically investigates how different training methods produce models with diverse generalization behaviors and representations, leading to improved ensemble performance and insights into feature overlap.

Contribution

It provides a large-scale empirical analysis showing that models trained with different methodologies learn diverse features and errors, enhancing ensemble effectiveness.

Findings

01. Models with different training methods have less correlated errors.
02. Ensembles of diverse models improve accuracy by up to 7%.
03. Low-accuracy models can still enhance high-accuracy models when combined.
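
The first two findings can be illustrated with a minimal sketch. It assumes "less correlated errors" is measured as the fraction of examples on which exactly one of two models is correct, and that the ensemble simply averages per-class probabilities. Both choices, along with the function names and the toy data, are assumptions made for illustration and are not taken from the paper's code.

```python
from typing import List, Sequence


def argmax_rows(probs: Sequence[Sequence[float]]) -> List[int]:
    """Return the index of the highest-probability class for each example."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


def accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def error_inconsistency(
    preds_a: Sequence[int], preds_b: Sequence[int], labels: Sequence[int]
) -> float:
    """Fraction of examples where exactly one of the two models is correct.

    Assumed metric for illustration: higher values mean the two models
    make their mistakes on different examples.
    """
    if not labels or not (len(preds_a) == len(preds_b) == len(labels)):
        raise ValueError("inputs must be non-empty and of equal length")
    differing = sum(
        (a == y) != (b == y) for a, b, y in zip(preds_a, preds_b, labels)
    )
    return differing / len(labels)


def ensemble_predict(
    probs_a: Sequence[Sequence[float]], probs_b: Sequence[Sequence[float]]
) -> List[int]:
    """Average the two models' class probabilities, then take the argmax."""
    averaged = [
        [(pa + pb) / 2 for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(probs_a, probs_b)
    ]
    return argmax_rows(averaged)


# Hypothetical toy data: four examples, two classes.
labels = [0, 1, 0, 1]
probs_a = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6], [0.1, 0.9]]  # wrong on example 2
probs_b = [[0.8, 0.2], [0.6, 0.4], [0.9, 0.1], [0.3, 0.7]]  # wrong on example 1

preds_a = argmax_rows(probs_a)
preds_b = argmax_rows(probs_b)
ensemble = ensemble_predict(probs_a, probs_b)

print("model A accuracy:", accuracy(preds_a, labels))
print("model B accuracy:", accuracy(preds_b, labels))
print("error inconsistency:", error_inconsistency(preds_a, preds_b, labels))
print("ensemble accuracy:", accuracy(ensemble, labels))
```

In this toy case each model errs on a different example, so the averaged prediction recovers both mistakes. When two models err on the same examples, averaging cannot help, which is why uncorrelated errors matter for ensembling.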

Abstract

Despite being able to capture a range of features of the data, high-accuracy models trained with supervision tend to make similar predictions. This seemingly implies that high-performing models share similar biases regardless of training methodology, which would limit the benefits of ensembling and leave low-accuracy models with little practical use. Against this backdrop, recent work has developed quite different training techniques, such as large-scale contrastive learning, that yield competitively high accuracy on generalization and robustness benchmarks. This motivates us to revisit the assumption that models necessarily learn similar functions. We conduct a large-scale empirical study of models across hyper-parameters, architectures, frameworks, and datasets. We find that model pairs that diverge more in training methodology display categorically different generalization behavior,…

Videos

No One Representation to Rule Them All: Overlapping Features of Training Methods · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning