Stabilized Sparse Online Learning for Sparse Data

Yuting Ma; Tian Zheng

arXiv:1604.06498 · stat.ML · May 10, 2017 · 1 citation

Open Access

TL;DR

This paper introduces a stabilized sparse online learning algorithm that improves convergence, stability, and accuracy in high-dimensional sparse data scenarios by adaptively controlling feature shrinkage and variability.

Contribution

The paper proposes a stabilized truncated stochastic gradient descent method with adaptive shrinkage, stability selection, and annealing strategies for better sparse learning.

Findings

01. Improved prediction accuracy over previous methods

02. Achieved sparser and more stable models

03. Faster convergence in high-dimensional settings

Abstract

Stochastic gradient descent (SGD) is commonly used for optimization in large-scale machine learning problems. Langford et al. (2009) introduced a sparse online learning method that induces sparsity via a truncated gradient. With high-dimensional sparse data, however, the method suffers from slow convergence and high variance because feature sparsity is heterogeneous. To mitigate this issue, we introduce a stabilized truncated stochastic gradient descent algorithm. We employ a soft-thresholding scheme on the weight vector in which the imposed shrinkage adapts to the amount of information available in each feature. The variability in the resulting sparse weight vector is further controlled by stability selection integrated with the informative truncation. To facilitate better convergence, we adopt an annealing strategy on the truncation rate, which leads to a balanced trade-off between…
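For orientation, the truncated gradient baseline of Langford et al. (2009) that the abstract builds on can be sketched roughly as follows. This is a minimal illustration of that baseline only, not the stabilized algorithm proposed in the paper: the squared loss, the sparse-dictionary input format, the function names `truncate` and `truncated_sgd`, and every parameter value (`eta`, `g`, `theta`, `K`) are assumptions chosen for the example.

```python
def truncate(w, alpha, theta):
    """Per-coordinate truncation: shrink weights with |v| <= theta toward
    zero by alpha, clipping at zero; leave larger weights untouched."""
    out = []
    for v in w:
        if 0.0 <= v <= theta:
            out.append(max(0.0, v - alpha))
        elif -theta <= v < 0.0:
            out.append(min(0.0, v + alpha))
        else:
            out.append(v)
    return out


def truncated_sgd(data, dim, eta=0.1, g=0.05, theta=1.0, K=5):
    """SGD on squared loss with a truncation step every K updates.

    data: iterable of (x, y), where x is a dict {feature_index: value}
          holding only the nonzero features (assumed input format).
    """
    w = [0.0] * dim
    for t, (x, y) in enumerate(data, start=1):
        # Prediction and gradient step touch only the nonzero features.
        err = sum(w[j] * v for j, v in x.items()) - y
        for j, v in x.items():
            w[j] -= eta * err * v
        # Every K steps, apply the accumulated shrinkage eta * g * K.
        if t % K == 0:
            w = truncate(w, alpha=eta * g * K, theta=theta)
    return w
```

Note that this sketch shrinks every coordinate by the same amount regardless of how often the feature is observed. That uniform treatment is the behavior the abstract identifies as problematic for heterogeneously sparse features, and it is what the adaptive, per-feature shrinkage described above is meant to replace.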

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Machine Learning and ELM