Semi-supervised Batch Active Learning via Bilevel Optimization

Zalán Borsos, Marco Tagliasacchi, Andreas Krause

arXiv:2010.09654·cs.LG·October 20, 2020

TL;DR

This paper introduces a batch active learning method for semi-supervised training that uses bilevel optimization to select the batch of points that best summarizes the unlabeled data pool. The authors report that it is highly effective for keyword detection when only a few labeled samples are available.

Contribution

The paper presents a bilevel optimization-based batch selection strategy for active learning with semi-supervised training, aimed at improving data efficiency in low-label regimes.

Findings

01

Effective in keyword detection tasks with few labels

02

Outperforms existing active learning methods

03

Reduces labeling effort significantly

Abstract

Active learning is an effective technique for reducing labeling cost by improving data efficiency. In this work, we propose a novel batch acquisition strategy for active learning in the setting where model training is performed in a semi-supervised manner. We formulate our approach as a data summarization problem via bilevel optimization, where the queried batch consists of the points that best summarize the unlabeled data pool. We show that our method is highly effective in keyword detection tasks in the regime where only a few labeled samples are available.
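
For readers unfamiliar with the term, the general shape of a bilevel data summarization problem can be sketched as follows. This is a generic cardinality-constrained form, not necessarily the exact objective used in the paper. The symbols are assumptions introduced here for illustration: U is the unlabeled pool, m is the batch size, S is the queried batch, and the inner and outer losses are placeholders.

```latex
% Generic sketch (assumed notation, not taken from the paper):
%   U = unlabeled pool, S = queried batch, m = batch size,
%   \theta = model parameters, \ell_in / \ell_out = placeholder losses.
\begin{aligned}
S^\star \in \arg\min_{S \subseteq U,\ |S| \le m} \quad
  & \ell_{\mathrm{out}}\bigl(\theta^\star(S);\, U\bigr) \\
\text{s.t.} \quad
  & \theta^\star(S) \in \arg\min_{\theta}\ \ell_{\mathrm{in}}(\theta;\, S)
\end{aligned}
```

Read this way, the inner problem trains a model on the candidate batch, and the outer problem scores how well that model does on the whole pool, so a good batch is one that "summarizes" the pool.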

Code & Models

Repositories

zalanborsos/bilevel_coresets
PyTorch, official
