Affinity and Diversity: Quantifying Mechanisms of Data Augmentation

Raphael Gontijo-Lopes; Sylvia J. Smullin; Ekin D. Cubuk; Ethan Dyer

arXiv:2002.08973 · cs.LG · June 8, 2020 · 54 cites

Open Access

TL;DR

This paper introduces interpretable measures called Affinity and Diversity to quantify how data augmentation improves neural network generalization, revealing that optimal augmentation balances both factors.

Contribution

It proposes new metrics for understanding data augmentation effects and demonstrates that combining affinity and diversity predicts augmentation success.

Findings

1. Augmentation performance correlates with joint affinity and diversity.

2. Optimal augmentation balances affinity and diversity.

3. Proposed measures are easy to compute and interpret.

Abstract

Though data augmentation has become a standard component of deep neural network training, the underlying mechanism behind the effectiveness of these techniques remains poorly understood. In practice, augmentation policies are often chosen using heuristics of either distribution shift or augmentation diversity. Inspired by these, we seek to quantify how data augmentation improves model generalization. To this end, we introduce interpretable and easy-to-compute measures: Affinity and Diversity. We find that augmentation performance is predicted not by either of these alone but by jointly optimizing the two.
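The abstract does not spell out how the two measures are computed. As a rough illustration only, the sketch below assumes that Affinity compares a clean-trained model's accuracy on augmented versus clean validation data, and that Diversity compares the final training loss reached with versus without the augmentation. The ratio form, the function names, and the example numbers are all assumptions made for this sketch and are not taken from this page.

```python
def affinity(acc_on_augmented_val: float, acc_on_clean_val: float) -> float:
    """Assumed form: accuracy of a model trained on clean data, evaluated on
    augmented validation data, relative to its accuracy on clean validation
    data. Values near 1.0 suggest the augmentation causes little shift."""
    if acc_on_clean_val <= 0:
        raise ValueError("clean validation accuracy must be positive")
    return acc_on_augmented_val / acc_on_clean_val


def diversity(final_loss_with_aug: float, final_loss_without_aug: float) -> float:
    """Assumed form: final training loss when training with the augmentation,
    relative to the final training loss when training on clean data.
    Larger values suggest the augmented data is harder to fit."""
    if final_loss_without_aug <= 0:
        raise ValueError("clean training loss must be positive")
    return final_loss_with_aug / final_loss_without_aug


# Hypothetical numbers, chosen only to show the arithmetic.
example_affinity = affinity(acc_on_augmented_val=0.45, acc_on_clean_val=0.90)
example_diversity = diversity(final_loss_with_aug=0.30, final_loss_without_aug=0.10)
print(f"affinity={example_affinity:.2f}, diversity={example_diversity:.2f}")
```

Under these assumptions both quantities need only scalar metrics from ordinary training and evaluation runs, which is in keeping with the abstract's description of the measures as easy to compute.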

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning