Sparse Online Learning via Truncated Gradient
John Langford, Lihong Li, Tong Zhang

TL;DR
This paper introduces a truncated gradient method for online learning that induces controllable sparsity in model weights, balancing sparsity and regret, and demonstrates its effectiveness on large-feature datasets.
Contribution
The paper presents a novel online sparsification technique called truncated gradient, extending $L_1$ regularization to online learning with theoretical regret guarantees.
Findings
Small sparsification rates cause minimal regret increase.
The method achieves substantial sparsity on datasets with many features.
Empirical results confirm the approach's effectiveness.
Abstract
We propose a general method called truncated gradient to induce sparsity in the weights of online learning algorithms with convex loss functions. This method has several essential properties: The degree of sparsity is continuous -- a parameter controls the rate of sparsification from no sparsification to total sparsification. The approach is theoretically motivated, and an instance of it can be regarded as an online counterpart of the popular $L_1$-regularization method in the batch setting. We prove that small rates of sparsification result in only small additional regret with respect to typical online learning guarantees. The approach works well empirically. We apply the approach to several datasets and find that for datasets with large numbers of features, substantial sparsity is discoverable.
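The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' code. The summary above does not state the truncation operator, so the form used here is an assumption: weights with magnitude at most `theta` are shrunk toward zero and clipped at zero, and larger weights are left alone. The function names, the parameters `eta`, `g`, `theta`, and `K`, and the choice of squared loss are all hypothetical, picked only for illustration.

```python
import numpy as np

def truncate(v, alpha, theta):
    """Assumed truncation operator: shrink entries with |v| <= theta
    toward zero by alpha (clipping at zero); leave larger entries unchanged."""
    shrunk = np.sign(v) * np.maximum(np.abs(v) - alpha, 0.0)
    return np.where(np.abs(v) <= theta, shrunk, v)

def truncated_gradient_sgd(X, y, eta=0.1, g=0.01, theta=np.inf, K=1, epochs=1):
    """Online gradient descent on squared loss (an illustrative choice),
    with a truncation step applied every K updates.
    g = 0 gives plain SGD; a larger g sparsifies more aggressively."""
    w = np.zeros(X.shape[1])
    step = 0
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            step += 1
            grad = (w @ x_i - y_i) * x_i       # gradient of 0.5 * (w.x - y)^2
            w = w - eta * grad                 # ordinary online gradient step
            if step % K == 0:                  # sparsify only every K-th step
                w = truncate(w, eta * K * g, theta)
    return w
```

In this sketch `g` plays the role of the sparsification rate mentioned in the abstract: at `g = 0` the truncation step is a no-op, and a sufficiently large `g` drives every weight to zero after each truncation.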
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Advanced Bandit Algorithms Research · Machine Learning and Algorithms
