Active Labeling: Streaming Stochastic Gradients
Vivien Cabannes, Francis Bach, Vianney Perchet, Alessandro Rudi

TL;DR
This paper introduces active labeling, a framework for obtaining stochastic gradients from partial supervision in a streaming setting, with the goal of minimizing generalization error relative to the number of samples used.
Contribution
It formalizes the active labeling problem, a form of active learning with partial supervision, and proposes a streaming technique with theoretical guarantees on the generalization error achieved per sample.
Findings
Stochastic gradients can be accessed without full supervision: partial information about the labels is enough to run stochastic gradient descent.
The technique is demonstrated in the context of robust regression.
The streaming technique provably minimizes the ratio of generalization error to the number of samples.
Abstract
The workhorse of machine learning is stochastic gradient descent. To access stochastic gradients, it is common to consider iteratively input/output pairs of a training dataset. Interestingly, it appears that one does not need full supervision to access stochastic gradients, which is the main motivation of this paper. After formalizing the "active labeling" problem, which focuses on active learning with partial supervision, we provide a streaming technique that provably minimizes the ratio of generalization error over the number of samples. We illustrate our technique in depth for robust regression.
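To make the claim about partial supervision concrete, here is a minimal sketch under stated assumptions; it is not the authors' algorithm. It assumes scalar outputs, a linear model, and the absolute-deviation loss |<theta, x> - y|, a standard robust regression objective. For that loss, a subgradient with respect to theta is sign(<theta, x> - y) * x, so the learner only needs one bit per sample: whether the label lies above the current prediction. The names one_bit_query, active_label_sgd, and synthetic_stream, the step-size schedule, and the synthetic Cauchy-noise data are all hypothetical choices made for illustration.

```python
import math
import random


def one_bit_query(hidden_label, threshold):
    """Hypothetical annotator: reveals only whether the label exceeds the threshold."""
    return hidden_label > threshold


def active_label_sgd(stream, dim, step=0.5):
    """Streaming SGD on the absolute-deviation loss |<theta, x> - y|.

    `stream` yields (x, query) pairs, where query(threshold) -> bool is the
    learner's only access to the label; y itself is never observed.
    """
    theta = [0.0] * dim
    for t, (x, query) in enumerate(stream, start=1):
        pred = sum(w * xi for w, xi in zip(theta, x))
        # One bit of supervision: is the label above the current prediction?
        sign = -1.0 if query(pred) else 1.0
        # sign * x is a subgradient of |pred - y| with respect to theta.
        lr = step / math.sqrt(t)  # assumed decaying step size
        theta = [w - lr * sign * xi for w, xi in zip(theta, x)]
    return theta


def synthetic_stream(theta_star, n, seed=0, noise_scale=0.1):
    """Assumed toy data: Gaussian inputs, linear signal, heavy-tailed Cauchy noise."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in theta_star]
        # Cauchy noise has median zero, so theta_star minimizes the expected loss.
        noise = noise_scale * math.tan(math.pi * (rng.random() - 0.5))
        y = sum(w * xi for w, xi in zip(theta_star, x)) + noise
        # Bind y as a default argument so each query closes over its own label.
        yield x, (lambda threshold, y=y: one_bit_query(y, threshold))


if __name__ == "__main__":
    estimate = active_label_sgd(synthetic_stream([1.0, -2.0], 20000), dim=2)
    print(estimate)
```

Because the update uses only the sign of the residual, arbitrarily large outliers in the labels change each step no more than small ones do, which is one reason this loss suits robust regression. Each sample is queried once and then discarded, which matches the streaming setting described in the abstract.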
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Markov Chains and Monte Carlo Methods
