RC-Mixup: A Data Augmentation Strategy against Noisy Data for Regression Tasks
Seong-Hyeon Hwang, Minsu Kim, Steven Euijong Whang

TL;DR
RC-Mixup is a data augmentation strategy that combines C-Mixup with multi-round robust training, so that each technique benefits the other and regression models perform better on noisy data.
Contribution
The paper introduces RC-Mixup, a new data augmentation method that enhances robust training for regression tasks with noisy data by integrating C-Mixup with multi-round training without modifying existing algorithms.
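To make the idea of combining label-aware mixing with multi-round training concrete, here is a minimal sketch. It is not the authors' algorithm: the small-loss selection rule, the 1-D least-squares model, and all names (`fit_line`, `mix_by_label`, `robust_mixup_rounds`, `keep_frac`, `bandwidth`) are hypothetical choices made for illustration. Each round mixes only the samples currently believed to be clean, refits the model, and then reselects the samples with the smallest residuals.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on 1-D inputs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx > 0 else 0.0
    return a, mean_y - a * mean_x


def mix_by_label(xs, ys, bandwidth, rng):
    """Mix each sample with a partner drawn by label proximity (Gaussian kernel)."""
    n = len(xs)
    out_x, out_y = [], []
    for i in range(n):
        weights = [math.exp(-((ys[i] - ys[j]) ** 2) / (2 * bandwidth ** 2))
                   for j in range(n)]
        j = rng.choices(range(n), weights=weights, k=1)[0]
        lam = rng.betavariate(2.0, 2.0)  # mixing ratio; Beta(2, 2) is an assumption
        out_x.append(lam * xs[i] + (1 - lam) * xs[j])
        out_y.append(lam * ys[i] + (1 - lam) * ys[j])
    return out_x, out_y


def robust_mixup_rounds(xs, ys, rounds=5, keep_frac=0.8, bandwidth=1.0, seed=0):
    """Alternate mixing on the presumed-clean subset with small-loss reselection."""
    rng = random.Random(seed)
    clean = list(range(len(xs)))  # round 0: treat every sample as clean
    a = b = 0.0
    for _ in range(rounds):
        cx = [xs[i] for i in clean]
        cy = [ys[i] for i in clean]
        mixed_x, mixed_y = mix_by_label(cx, cy, bandwidth, rng)
        a, b = fit_line(cx + mixed_x, cy + mixed_y)
        # Assumed robust-training step: keep the samples with the smallest
        # squared residuals, scored over ALL samples so dropped ones can return.
        order = sorted(range(len(xs)), key=lambda i: (ys[i] - (a * xs[i] + b)) ** 2)
        clean = sorted(order[:max(2, int(keep_frac * len(xs)))])
    return (a, b), clean
```

For example, with points on `y = 2x + 1` and a few labels shifted far off the line, calling `robust_mixup_rounds(xs, ys)` should drop the shifted points from the returned `clean` index list after the first round or two, so later rounds mix and fit on clean samples only.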
Findings
RC-Mixup significantly outperforms C-Mixup and robust training baselines.
RC-Mixup can be integrated with various robust training methods.
RC-Mixup improves data quality for better regression performance.
Abstract
We study the problem of robust data augmentation for regression tasks in the presence of noisy data. Data augmentation is essential for generalizing deep learning models, but most techniques, such as the popular Mixup, are primarily designed for classification tasks on image data. More recently, Mixup techniques specialized for regression tasks, such as C-Mixup, have been proposed. In comparison to Mixup, which takes linear interpolations of pairs of samples, C-Mixup is more selective in which samples to mix based on their label distances for better regression performance. However, C-Mixup does not distinguish noisy versus clean samples, which can be problematic when mixing and lead to suboptimal model performance. At the same time, robust training, where the goal is to train accurate models against noisy data through multiple rounds of model training, has been heavily studied. We thus…
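The label-distance selection that the abstract attributes to C-Mixup can be sketched as follows. This is an illustrative sketch under stated assumptions, not the reference implementation: it assumes scalar labels, a Gaussian kernel on label distance with a `bandwidth` parameter, a Beta(`alpha`, `alpha`) mixing ratio, and that a sample may be paired with itself. The function name `cmixup_pairs` and its parameters are hypothetical.

```python
import math
import random


def cmixup_pairs(X, y, bandwidth=1.0, alpha=2.0, seed=None):
    """Create one mixed sample per input, pairing samples with nearby labels.

    X is a list of feature vectors (lists of floats); y is a list of scalar labels.
    Returns (mixed_X, mixed_y, partners), where partners[i] is the index mixed with i.
    """
    rng = random.Random(seed)
    n = len(X)
    mixed_X, mixed_y, partners = [], [], []
    for i in range(n):
        # Partner probability decays with squared label distance, so samples
        # whose labels are far apart are rarely mixed together.
        weights = [math.exp(-((y[i] - y[j]) ** 2) / (2 * bandwidth ** 2))
                   for j in range(n)]
        j = rng.choices(range(n), weights=weights, k=1)[0]
        lam = rng.betavariate(alpha, alpha)  # mixing ratio in (0, 1)
        # Linear interpolation of both features and labels, as in standard Mixup.
        mixed_X.append([lam * u + (1 - lam) * v for u, v in zip(X[i], X[j])])
        mixed_y.append(lam * y[i] + (1 - lam) * y[j])
        partners.append(j)
    return mixed_X, mixed_y, partners
```

A small `bandwidth` restricts mixing to samples with nearly identical labels, while a large one approaches uniform pairing as in plain Mixup. Note that nothing in this sampling step checks whether a label is noisy, which is the gap the abstract points out.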
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Data Classification · Gaussian Processes and Bayesian Inference
Methods: Mixup
