Global Mixup: Eliminating Ambiguity with Clustering

Xiangjin Xie; Yangning Li; Wang Chen; Kai Ouyang; Li Jiang; and Haitao Zheng

arXiv:2206.02734 · cs.LG · June 7, 2022

TL;DR

Global Mixup introduces a two-stage data augmentation method that uses clustering to generate more reliable virtual samples, improving model performance across various neural network architectures and tasks.

Contribution

The paper proposes a novel two-stage augmentation approach that decouples sample generation from labeling using clustering, expanding sampling space and reducing label ambiguity.

Findings

01 Significantly outperforms state-of-the-art baselines on multiple tasks.

02 Effective in low-resource scenarios.

03 Applicable to CNN, LSTM, and BERT models.

Abstract

Data augmentation with Mixup has proven to be an effective method for regularizing deep neural networks. Mixup generates virtual samples and their corresponding labels at once through linear interpolation. However, this one-stage generation paradigm and the use of linear interpolation have two defects: (1) The label of a generated sample is combined directly from the labels of the original sample pair without reasonable judgment, which makes the labels likely to be ambiguous. (2) Linear combination significantly limits the sampling space for generating samples. To tackle these problems, we propose a novel and effective augmentation method based on global clustering relationships, named Global Mixup. Specifically, we transform the previous one-stage augmentation process into a two-stage one, decoupling the process of generating virtual samples from the…
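For orientation, the one-stage linear interpolation that the abstract attributes to standard Mixup can be sketched as follows. This is a minimal illustration of vanilla Mixup only, not of Global Mixup's relabeling stage, which the truncated abstract does not specify. The function names `mixup_pair` and `sample_lambda`, the toy vectors, and the choice `alpha = 1.0` are hypothetical; drawing the mixing weight from a Beta(alpha, alpha) distribution follows the original Mixup formulation rather than anything stated on this page.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, lam):
    """Interpolate two samples and their one-hot labels with weight lam.

    The same weight is applied to inputs and labels, which is the
    one-stage behaviour the abstract describes for standard Mixup.
    """
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix


def sample_lambda(alpha, rng):
    """Draw a mixing weight in [0, 1] from Beta(alpha, alpha) (assumed prior)."""
    return rng.betavariate(alpha, alpha)


# Toy example: two samples from different classes (hypothetical data).
x_a, y_a = [1.0, 0.0], [1.0, 0.0]
x_b, y_b = [0.0, 1.0], [0.0, 1.0]

rng = random.Random(0)
lam = sample_lambda(1.0, rng)          # alpha = 1.0 is an assumed setting
x_mix, y_mix = mixup_pair(x_a, y_a, x_b, y_b, lam)

# With lam = 0.5 the virtual sample sits midway between the two inputs
# and its label is split evenly between the classes.
x_half, y_half = mixup_pair(x_a, y_a, x_b, y_b, 0.5)
```

In this sketch every virtual sample lies on the line segment between the two inputs, and its label is fixed by the same weight. These are the two properties the abstract identifies as limiting the sampling space and making labels ambiguous.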

Taxonomy

Topics: Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification

Methods: Mixup