Exponential Convergence Rates of Classification Errors on Learning with SGD and Random Features
Shingo Yashima, Atsushi Nitanda, Taiji Suzuki

TL;DR
This paper shows that binary classification with random features and stochastic gradient descent attains an exponential convergence rate of the expected classification error under the strong low-noise condition. The bound holds regardless of the number of features, which gives a computational benefit.
Contribution
The study extends the existing exponential convergence analysis to the random features setting. It shows that the error bound is unaffected by the number of features and highlights the resulting computational advantage.
Findings
Exponential convergence of classification error with random features.
Convergence rate independent of the number of features.
Significant computational benefits under low-noise conditions.
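For context, the strong low-noise condition mentioned in the abstract is commonly stated as a margin on the conditional label probability. The form below is a sketch of that usual formulation, not a quotation from the paper; the symbols \rho (the data distribution) and \delta (the margin) are notation assumed here.

```latex
\exists\, \delta \in \left(0, \tfrac{1}{2}\right) : \quad
\left| \rho(Y = 1 \mid x) - \tfrac{1}{2} \right| > \delta
\quad \text{for almost every } x .
```

Under such a condition a classifier only has to recover the sign of the optimal predictor, which is the intuition usually given for why the classification error can decay much faster than the surrogate loss.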
Abstract
Although kernel methods are widely used in many learning problems, they have poor scalability to large datasets. To address this problem, sketching and stochastic gradient methods are the most commonly used techniques to derive efficient large-scale learning algorithms. In this study, we consider solving a binary classification problem using random features and stochastic gradient descent. In recent research, an exponential convergence rate of the expected classification error under the strong low-noise condition has been shown. We extend these analyses to a random features setting, analyzing the error induced by the approximation of random features in terms of the distance between the generated hypothesis including population risk minimizers and empirical risk minimizers when using general Lipschitz loss functions, to show that an exponential convergence of the expected classification…
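The setting described in the abstract can be illustrated with a minimal sketch: random Fourier features approximating a Gaussian kernel, trained by one-pass SGD on an L2-regularized logistic loss. This is not the authors' implementation. The toy data with a margin, the bandwidth, the step-size schedule, the regularization strength, the use of the last iterate instead of an averaged one, and all function names are assumptions made for illustration.

```python
import math
import random


def make_feature_map(dim, num_features, bandwidth, rng):
    """Random Fourier features whose inner product approximates a Gaussian kernel."""
    # Frequencies ~ N(0, 1/bandwidth^2), phases ~ Uniform[0, 2*pi].
    omegas = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)]
              for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def phi(x):
        return [scale * math.cos(sum(w * xi for w, xi in zip(omega, x)) + b)
                for omega, b in zip(omegas, phases)]

    return phi


def sample_point(rng, margin=0.3):
    """Toy data (assumption): label is sign(x[0]); points with |x[0]| < margin
    are rejected, so labels are noise-free and the classes are separated."""
    while True:
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        if abs(x[0]) >= margin:
            return x, (1 if x[0] > 0 else -1)


def sgd_train(phi, num_features, steps, rng, lam=1e-3, lr0=1.0):
    """One-pass SGD on the L2-regularized logistic loss in feature space."""
    theta = [0.0] * num_features
    for t in range(1, steps + 1):
        x, y = sample_point(rng)
        z = phi(x)
        score = sum(th * zi for th, zi in zip(theta, z))
        # Derivative of log(1 + exp(-y * score)) with respect to score.
        grad_scale = -y / (1.0 + math.exp(y * score))
        lr = lr0 / (1.0 + lam * lr0 * t)  # decaying step size (assumption)
        theta = [th - lr * (grad_scale * zi + lam * th)
                 for th, zi in zip(theta, z)]
    return theta


def classification_error(theta, phi, rng, n=500):
    """Fraction of fresh samples whose predicted sign disagrees with the label."""
    mistakes = 0
    for _ in range(n):
        x, y = sample_point(rng)
        score = sum(th * zi for th, zi in zip(theta, phi(x)))
        predicted = 1 if score > 0 else -1
        mistakes += predicted != y
    return mistakes / n


def run_demo(seed=0, num_features=200, steps=3000):
    rng = random.Random(seed)
    phi = make_feature_map(dim=2, num_features=num_features,
                           bandwidth=1.0, rng=rng)
    theta = sgd_train(phi, num_features, steps, rng)
    return classification_error(theta, phi, rng)
```

Calling `run_demo()` returns the test error of the toy run as a fraction. Varying `num_features` or `steps` is a simple way to explore, on this example only, how the error behaves as the feature count or the number of SGD iterations changes.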
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Domain Adaptation and Few-Shot Learning
