Active Labeling: Streaming Stochastic Gradients

Vivien Cabannes; Francis Bach; Vianney Perchet; Alessandro Rudi

arXiv:2205.13255 · cs.LG · December 8, 2022

TL;DR

This paper formalizes active labeling, an active learning problem with partial supervision, and shows that stochastic gradients can be obtained in a streaming setting without full supervision, with the aim of minimizing generalization error per sample.

Contribution

It formalizes the active labeling problem and proposes a streaming technique that provably minimizes the ratio of generalization error to the number of samples.

Findings

01. The proposed method effectively reduces generalization error with fewer samples.

02. The technique is demonstrated in the context of robust regression.

03. Theoretical analysis confirms the efficiency of the streaming approach.

Abstract

The workhorse of machine learning is stochastic gradient descent. To access stochastic gradients, it is common to consider iteratively input/output pairs of a training dataset. Interestingly, it appears that one does not need full supervision to access stochastic gradients, which is the main motivation of this paper. After formalizing the "active labeling" problem, which focuses on active learning with partial supervision, we provide a streaming technique that provably minimizes the ratio of generalization error over the number of samples. We illustrate our technique in depth for robust regression.
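
To make the abstract's central observation concrete, here is a minimal sketch. It is not the authors' algorithm. It assumes a one-dimensional linear model trained with the absolute-deviation loss |f(x) - y|, whose subgradient with respect to the prediction is sign(f(x) - y). That sign can be read off a single yes/no query ("is the label below the current prediction?"), so each SGD step uses one bit of supervision instead of the label itself. The names `label_is_below`, `make_stream`, and `active_sgd_median`, the synthetic data, and the step-size schedule are all illustrative assumptions.

```python
import random


def label_is_below(y_hidden, threshold):
    """Partial supervision (assumed oracle): reveals one bit about the hidden label."""
    return y_hidden < threshold


def make_stream(n, seed=0, true_w=2.0, true_b=1.0, noise=0.5):
    """Synthetic stream of (x, y) pairs with y = true_w * x + true_b + symmetric noise."""
    rng = random.Random(seed)
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        yield x, true_w * x + true_b + rng.gauss(0.0, noise)


def active_sgd_median(stream, step=0.5):
    """SGD on the absolute-deviation loss using only one-bit label queries."""
    w, b = 0.0, 0.0
    for t, (x, y_hidden) in enumerate(stream, start=1):
        pred = w * x + b
        # The learner never reads y_hidden directly; it only asks the oracle
        # whether the label lies below the current prediction.
        sign = 1.0 if label_is_below(y_hidden, pred) else -1.0
        lr = step / t ** 0.5  # decaying step size (illustrative choice)
        # Subgradient of |pred - y| w.r.t. (w, b) is sign(pred - y) * (x, 1).
        w -= lr * sign * x
        b -= lr * sign
    return w, b


w_hat, b_hat = active_sgd_median(make_stream(20000))
print(f"estimated slope {w_hat:.2f}, intercept {b_hat:.2f}")
```

Because the noise in this toy stream is symmetric, the conditional median of y given x is the line used to generate the data, so the estimates should land close to the assumed slope 2 and intercept 1 even though no label is ever observed in full.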

Code & Models

Repositories

viviencabannes/active-labeling (official)

Videos

Active Labeling: Streaming Stochastic Gradients · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Markov Chains and Monte Carlo Methods