Linearly Convergent Mixup Learning
Gakuto Obi, Ayato Saito, Yuto Sasaki, Tsuyoshi Kato

TL;DR
This paper introduces two new hyperparameter-free algorithms for mixup data augmentation in RKHS-based binary classification that scale linearly with dataset size and outperform gradient descent in convergence speed and predictive accuracy.
Contribution
The paper proposes two novel, hyperparameter-free algorithms for mixup learning in RKHS that are more efficient and broadly applicable than existing gradient-based methods.
Findings
Algorithms converge faster than gradient descent.
Mixup improves predictive performance across loss functions.
Methods scale linearly with dataset size.
Abstract
Learning in a reproducing kernel Hilbert space (RKHS), as in the support vector machine, has been recognized as a promising technique. It remains highly effective and competitive in numerous prediction tasks, particularly when training data are scarce or computational resources are limited. These methods are especially valued for their ability to work with small datasets and for their interpretability. Mixup data augmentation, widely used in deep learning to address limited training data, has remained challenging to apply to learning in RKHS because it generates intermediate class labels. Although gradient descent methods handle these labels effectively, dual optimization approaches are typically not directly applicable. In this study, we present two novel algorithms that extend to a broader range of binary classification models. Unlike…
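To illustrate what "intermediate class labels" means here, the following is a minimal sketch of standard mixup applied to a binary dataset with labels in {-1, +1}. It is not one of the paper's two algorithms. The function names `mixup_pair` and `mixup_dataset`, the Beta(alpha, alpha) sampling of the mixing weight, and the toy data are assumptions made for illustration only.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, lam):
    """Return the convex combination of two labeled examples (standard mixup)."""
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = lam * y_i + (1.0 - lam) * y_j
    return x_mix, y_mix


def mixup_dataset(X, y, n_new, alpha=1.0, seed=0):
    """Generate n_new mixed examples from randomly chosen pairs.

    The mixing weight is drawn from Beta(alpha, alpha); this choice is an
    assumption of the sketch, not something stated in the summary above.
    """
    rng = random.Random(seed)
    X_new, y_new = [], []
    for _ in range(n_new):
        i = rng.randrange(len(X))
        j = rng.randrange(len(X))
        lam = rng.betavariate(alpha, alpha)
        x_mix, y_mix = mixup_pair(X[i], y[i], X[j], y[j], lam)
        X_new.append(x_mix)
        y_new.append(y_mix)
    return X_new, y_new


# Toy binary dataset (hypothetical values) with labels in {-1, +1}.
X = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [3.0, 1.0]]
y = [-1.0, -1.0, 1.0, 1.0]

X_aug, y_aug = mixup_dataset(X, y, n_new=5)
```

Mixing a negative and a positive example yields a label strictly between -1 and +1, for example `mixup_pair([0.0, 0.0], -1.0, [2.0, 0.0], 1.0, 0.25)` gives the label 0.5. Labels of this kind are what a solver written for hard ±1 labels does not directly accommodate.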
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Face and Expression Recognition · Machine Learning and ELM
Methods: Mixup
