The Effects of Mixed Sample Data Augmentation are Class Dependent
Haeil Lee, Hansang Lee, Junmo Kim

TL;DR
This paper investigates how Mixed Sample Data Augmentation techniques affect different classes unequally, identifies causes of class dependency, and proposes a training method to mitigate negative impacts and improve overall accuracy.
Contribution
It reveals the class-dependent effects of MSDA and introduces an algorithm that combines MSDA with non-MSDA data to address this issue.
Findings
MSDA causes class-dependent performance variations
Training with mixed MSDA and non-MSDA data mitigates negative effects
Overall accuracy is improved by the proposed method
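One way to make the first finding concrete is to compare per-class accuracy between a baseline model and an MSDA-trained model. The following is a minimal sketch of that comparison, not the paper's evaluation code; the function names `per_class_accuracy` and `class_accuracy_change` and the toy predictions are hypothetical.

```python
import numpy as np

def per_class_accuracy(y_true, y_pred, num_classes):
    """Return accuracy computed separately for each class (NaN if a class is absent)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    acc = np.full(num_classes, np.nan)
    for c in range(num_classes):
        mask = y_true == c
        if mask.any():
            acc[c] = (y_pred[mask] == c).mean()
    return acc

def class_accuracy_change(y_true, pred_baseline, pred_msda, num_classes):
    """Per-class accuracy of the MSDA model minus that of the baseline model."""
    return (per_class_accuracy(y_true, pred_msda, num_classes)
            - per_class_accuracy(y_true, pred_baseline, num_classes))

# Toy example (fabricated labels for illustration only):
y_true = [0, 0, 1, 1]
pred_baseline = [0, 0, 1, 0]   # class 0: 2/2 correct, class 1: 1/2 correct
pred_msda = [0, 1, 1, 1]       # class 0: 1/2 correct, class 1: 2/2 correct
delta = class_accuracy_change(y_true, pred_baseline, pred_msda, num_classes=2)
```

In this toy case both models have the same overall accuracy (3 of 4 correct), yet class 0 loses accuracy and class 1 gains it, which is the kind of class-dependent shift an aggregate metric hides.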
Abstract
Mixed Sample Data Augmentation (MSDA) techniques, such as Mixup, CutMix, and PuzzleMix, have been widely acknowledged for enhancing performance in a variety of tasks. A previous study reported the class dependency of traditional data augmentation (DA), where certain classes benefit disproportionately compared to others. This paper reveals a class-dependent effect of MSDA: some classes experience improved performance while others experience degraded performance. We address this class dependency and propose an algorithm to mitigate it. The approach trains on a mixture of MSDA and non-MSDA data, which not only mitigates the negative impact on the affected classes but also improves overall accuracy. Furthermore, we provide an in-depth analysis and discussion of why MSDA introduces class dependencies and which classes are most likely to be affected.
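The abstract describes training on a mixture of MSDA and non-MSDA data without giving the mixing rule. As a rough illustration only, the sketch below applies standard Mixup to a batch with some probability and otherwise passes the batch through unchanged. The per-batch coin flip, the `msda_prob` parameter, and the function names are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha, rng):
    """Standard Mixup: blend a batch with a shuffled copy of itself."""
    lam = rng.beta(alpha, alpha)          # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))        # random pairing of samples
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mixed, y_mixed

def mixed_msda_batch(x, y_onehot, msda_prob=0.5, alpha=1.0, rng=None):
    """Return a Mixup batch with probability msda_prob, else the clean batch.

    msda_prob is a hypothetical knob for this sketch; the paper's actual
    schedule for combining MSDA and non-MSDA data may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    if rng.random() < msda_prob:
        return mixup_batch(x, y_onehot, alpha, rng)
    return x, y_onehot

# Usage: 8 fake images of shape (3, 4, 4) with 5 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 3, 4, 4))
y = np.eye(5)[rng.integers(0, 5, size=8)]
x_out, y_out = mixed_msda_batch(x, y, msda_prob=0.5, rng=rng)
```

Setting `msda_prob` to 0 recovers ordinary training and setting it to 1 recovers pure Mixup, so the single parameter spans the two regimes the abstract contrasts. CutMix or PuzzleMix could be substituted for `mixup_batch` under the same interface.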
Taxonomy
Topics: Artificial Intelligence in Healthcare · Advanced Data Compression Techniques · Traffic Prediction and Management Techniques
Methods: Mixup · CutMix
