From Weakly Supervised Learning to Active Learning

Vivien Cabannes

arXiv:2209.11629 · cs.LG · September 26, 2022 · 1 citation

Open Access

TL;DR

This paper explores a unified framework for weakly supervised and active learning, proposing methods to disambiguate partial labels, incorporate unsupervised techniques, and efficiently query data, aiming to reduce data annotation efforts.

Contribution

It introduces a novel weak supervision model with set-based targets, a scalable manifold regularization algorithm, and an active learning framework that minimizes annotation needs.

Findings

1. Proposed a set-based weak supervision model for ambiguous labels.

2. Developed a scalable diffusion-based manifold regularization algorithm.

3. Introduced an active learning approach that reduces annotation effort.

Abstract

Applied mathematics and machine computations have raised a lot of hope since the recent success of supervised learning. Many practitioners in industry have been trying to switch from their old paradigms to machine learning. Interestingly, those data scientists spend more time scraping, annotating and cleaning data than fine-tuning models. This thesis is motivated by the following question: can we derive a more generic framework than the one of supervised learning in order to learn from cluttered data? This question is approached through the lens of weakly supervised learning, assuming that the bottleneck of data collection lies in annotation. We model weak supervision as giving, rather than a unique target, a set of target candidates. We argue that one should look for an "optimistic" function that matches most of the observations. This allows us to derive a principle to…
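To make the set-based view of weak supervision concrete, here is a minimal sketch of the "optimistic" idea: each observation is a set of candidate targets, and a prediction is only penalized if it falls outside that set. The sketch assumes a 0-1 loss, a single input context with no features, and non-empty candidate sets; the names `infimum_loss` and `optimistic_label` are hypothetical and chosen for illustration. It is not the algorithm developed in the thesis.

```python
def infimum_loss(prediction, candidates):
    """0-1 loss minimized over a non-empty candidate set.

    Returns 0 if the prediction matches at least one candidate, else 1.
    (Assumption: 0-1 loss; other base losses could be substituted.)
    """
    return min(0 if prediction == y else 1 for y in candidates)


def optimistic_label(candidate_sets, classes):
    """Pick the class with the smallest total infimum loss.

    This is the label compatible with the most observed candidate sets.
    Ties are broken by the order of `classes`.
    """
    return min(
        classes,
        key=lambda z: sum(infimum_loss(z, s) for s in candidate_sets),
    )


# Toy data (made up for illustration): each annotation is a set of candidates.
observations = [{"cat", "dog"}, {"cat", "bird"}, {"cat"}, {"dog", "bird"}]
classes = ["bird", "cat", "dog"]

best = optimistic_label(observations, classes)
print(best)
```

In this toy example, "cat" is consistent with three of the four candidate sets, while "bird" and "dog" are each consistent with only two, so the optimistic choice is "cat".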

Taxonomy

Topics: Machine Learning and Algorithms · Gaussian Processes and Bayesian Inference

Methods: Diffusion