Provable Benefit of Cutout and CutMix for Feature Learning

Junsoo Oh; Chulhee Yun

arXiv:2410.23672·cs.LG·November 1, 2024

TL;DR

This paper provides a theoretical analysis showing that Cutout and CutMix augmentations enable neural networks to learn a broader range of features, with CutMix offering the most comprehensive feature learning and highest test accuracy.

Contribution

The paper offers the first theoretical account of how Cutout and CutMix improve feature learning in neural networks, in particular for rare features in the presence of label-independent noise.

Findings

01 Cutout training learns low-frequency features that vanilla training cannot.

02 CutMix training learns even rarer features that Cutout cannot.

03 CutMix achieves the highest test accuracy among the three methods (vanilla, Cutout, and CutMix).

Abstract

Patch-level data augmentation techniques such as Cutout and CutMix have demonstrated significant efficacy in enhancing the performance of vision tasks. However, a comprehensive theoretical understanding of these methods remains elusive. In this paper, we study two-layer neural networks trained using three distinct methods: vanilla training without augmentation, Cutout training, and CutMix training. Our analysis focuses on a feature-noise data model, which consists of several label-dependent features of varying rarity and label-independent noises of differing strengths. Our theorems demonstrate that Cutout training can learn low-frequency features that vanilla training cannot, while CutMix training can learn even rarer features that Cutout cannot capture. From this, we establish that CutMix yields the highest test accuracy among the three. Our novel analysis reveals that CutMix training…
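
As a rough illustration of the two augmentations compared in the abstract, the sketch below applies them to a sample represented as a stack of patch vectors. This is a simplified patch-level version written for this summary, not the paper's exact formulation: the function names `cutout` and `cutmix`, the `[num_patches, dim]` layout, the uniform choice of patches, and the one-hot labels are all assumptions made for the example.

```python
import numpy as np

def cutout(x, num_masked, rng):
    """Zero out `num_masked` randomly chosen patches of x (shape [P, d]).

    The label is left unchanged by Cutout.
    """
    out = x.copy()
    masked = rng.choice(x.shape[0], size=num_masked, replace=False)
    out[masked] = 0.0
    return out, masked

def cutmix(x_a, y_a, x_b, y_b, num_swapped, rng):
    """Paste `num_swapped` patches of x_b into x_a and mix the labels.

    The label weight of each sample equals its share of patches in the output.
    """
    num_patches = x_a.shape[0]
    swapped = rng.choice(num_patches, size=num_swapped, replace=False)
    out = x_a.copy()
    out[swapped] = x_b[swapped]
    lam = 1.0 - num_swapped / num_patches  # share of patches kept from x_a
    y_mixed = lam * y_a + (1.0 - lam) * y_b
    return out, y_mixed, swapped

# Toy data (hypothetical): two samples with 4 patches of dimension 8 each.
rng = np.random.default_rng(0)
x_a = rng.normal(size=(4, 8))
x_b = rng.normal(size=(4, 8))
y_a = np.array([1.0, 0.0])  # one-hot label of sample a
y_b = np.array([0.0, 1.0])  # one-hot label of sample b

x_cut, masked = cutout(x_a, num_masked=1, rng=rng)
x_mix, y_mix, swapped = cutmix(x_a, y_a, x_b, y_b, num_swapped=1, rng=rng)
```

With one of four patches swapped, the mixed label puts weight 0.75 on the first sample's class and 0.25 on the second's. Cutout only removes information from a single sample, whereas CutMix combines patches and labels from two samples, which is the structural difference the paper's analysis builds on.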

Videos

Provable Benefit of Cutout and CutMix for Feature Learning · SlidesLive

Taxonomy

Topics: Machine Learning and Data Classification

Methods: Cutout · CutMix