Exponential Convergence Rates of Classification Errors on Learning with SGD and Random Features

Shingo Yashima; Atsushi Nitanda; Taiji Suzuki

arXiv:1911.05350 · stat.ML · June 3, 2022 · 1 cites

TL;DR

This paper shows that binary classification with random features and stochastic gradient descent achieves an exponential convergence rate of the expected classification error under the strong low-noise condition. The rate does not depend on the number of features, and random features bring a computational benefit in this setting.

Contribution

The study extends earlier exponential convergence analyses to the random features setting, showing that the convergence rate does not depend on the number of features and that random features offer computational advantages under the strong low-noise condition.

Findings

01. Exponential convergence of classification error with random features.

02. Convergence rate independent of the number of features.

03. Significant computational benefits under low-noise conditions.
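
For orientation, the low-noise condition referred to in the findings is usually stated along the following lines in the literature this paper builds on. The notation here ($\rho$, $\delta$, $g_T$, $R$, $R^*$, $C$, $c$) is assumed for illustration and is not taken from this page; the paper's own statement and constants may differ.

```latex
% Strong low-noise condition (assumed notation): the conditional label
% probability stays bounded away from 1/2 almost everywhere.
\[
  \exists\, \delta \in (0, 1/2) : \quad
  \bigl| \rho(Y = 1 \mid x) - \tfrac{1}{2} \bigr| > \delta
  \quad \text{for almost every } x .
\]
% Schematic shape of an exponential convergence rate: R is the expected
% classification error, R^* the Bayes error, g_T the hypothesis after T
% SGD iterations, and C, c > 0 are unspecified problem-dependent constants.
\[
  \mathbb{E}\bigl[ R(g_T) \bigr] - R^* \;\le\; C \exp(-c\, T) .
\]
```

The second display is only the general shape of such a bound, not the paper's theorem.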

Abstract

Although kernel methods are widely used in many learning problems, they have poor scalability to large datasets. To address this problem, sketching and stochastic gradient methods are the most commonly used techniques to derive efficient large-scale learning algorithms. In this study, we consider solving a binary classification problem using random features and stochastic gradient descent. In recent research, an exponential convergence rate of the expected classification error under the strong low-noise condition has been shown. We extend these analyses to a random features setting, analyzing the error induced by the approximation of random features in terms of the distance between the generated hypothesis including population risk minimizers and empirical risk minimizers when using general Lipschitz loss functions, to show that an exponential convergence of the expected classification…
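
The setting described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the paper's algorithm or experimental setup: it assumes random Fourier features for a Gaussian kernel, the logistic loss as one example of a Lipschitz loss, a constant step size with iterate averaging, and a toy 2-D dataset in which labels are flipped with probability 0.1 so that the conditional label probability stays away from 1/2. All function names and hyperparameters (`make_rff`, `averaged_sgd`, `sample_low_noise_data`, the bandwidth, step size, and so on) are hypothetical.

```python
import numpy as np

def make_rff(dim, num_features, bandwidth, rng):
    """Sample a random Fourier feature map approximating a Gaussian kernel (assumed kernel choice)."""
    W = rng.normal(scale=1.0 / bandwidth, size=(dim, num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    scale = np.sqrt(2.0 / num_features)
    return lambda X: scale * np.cos(X @ W + b)

def sample_low_noise_data(n, rng, gap=0.2, flip_prob=0.1):
    """Toy data: Bayes label is sign(x0); observed labels are flipped with probability flip_prob."""
    X = rng.uniform(-1.0, 1.0, size=(n, 2))
    # Keep points at least `gap` away from the decision boundary x0 = 0 (a toy simplification).
    X[:, 0] = np.sign(X[:, 0]) * (gap + (1.0 - gap) * np.abs(X[:, 0]))
    bayes = np.sign(X[:, 0])
    flips = rng.random(n) < flip_prob
    y = np.where(flips, -bayes, bayes)
    return X, y, bayes

def averaged_sgd(features, labels, step=0.5, reg=1e-3):
    """One pass of SGD on the L2-regularized logistic loss; returns the averaged iterate."""
    n, m = features.shape
    theta = np.zeros(m)
    theta_avg = np.zeros(m)
    for t in range(n):
        z, y = features[t], labels[t]
        margin = y * (z @ theta)
        # sigmoid(-margin), written with tanh for numerical stability
        weight = 0.5 * (1.0 - np.tanh(0.5 * margin))
        grad = -y * weight * z + reg * theta
        theta -= step * grad
        theta_avg += (theta - theta_avg) / (t + 1)  # running average of iterates
    return theta_avg

rng = np.random.default_rng(0)
feature_map = make_rff(dim=2, num_features=200, bandwidth=1.0, rng=rng)
X_train, y_train, _ = sample_low_noise_data(5000, rng)
X_test, _, bayes_test = sample_low_noise_data(2000, rng)

theta = averaged_sgd(feature_map(X_train), y_train)
# Fraction of test points where the learned sign matches the Bayes classifier.
agreement = np.mean(np.sign(feature_map(X_test) @ theta) == bayes_test)
print(f"agreement with Bayes classifier: {agreement:.3f}")
```

Measuring agreement with the Bayes classifier, rather than accuracy on the noisy labels, mirrors the quantity of interest here: the excess classification error, which can shrink quickly even while the surrogate loss converges slowly.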

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Domain Adaptation and Few-Shot Learning