Unsupervised Active Learning in Large Domains

Harald Steck; Tommi S. Jaakkola

arXiv:1301.0602·cs.LG·January 7, 2013


Open Access

TL;DR

This paper introduces a new surrogate measure for active learning in large domains, enabling effective query optimization with small committees, and demonstrates its utility in network model recovery.

Contribution

It proposes a novel surrogate measure for active learning that works efficiently with small committees and introduces a bootstrap method for committee selection.

Findings

01. The surrogate measure improves active learning efficiency with small committees.

02. The bootstrap approach enhances committee selection quality.

03. Application to network model recovery shows practical benefits.

Abstract

Active learning is a powerful approach to analyzing data effectively. We show that the feasibility of active learning depends crucially on the choice of measure with respect to which the query is being optimized. The standard information gain, for example, does not permit an accurate evaluation with a small committee, a representative subset of the model space. We propose a surrogate measure requiring only a small committee and discuss the properties of this new measure. We devise, in addition, a bootstrap approach for committee selection. The advantages of this approach are illustrated in the context of recovering (regulatory) network models.
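To make the terms in the abstract concrete, here is a minimal sketch of two standard ingredients it refers to: building a committee by bootstrap resampling, and scoring a query by the committee-based information gain (the entropy of the averaged prediction minus the average entropy of each member's prediction). This is not the paper's surrogate measure, which the abstract does not specify, and it is not necessarily the authors' bootstrap procedure. The toy setting (binary outcomes per query, frequency-based models with add-one smoothing) and all function names are assumptions made for illustration.

```python
import math
import random


def entropy(p):
    """Shannon entropy (in nats) of a binary outcome with success probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def bootstrap_committee(data, n_members, rng):
    """Build a committee by resampling each query's outcomes with replacement.

    data: dict mapping a query name to a list of observed 0/1 outcomes.
    Returns a list of models; each model maps a query to a smoothed
    success probability (add-one smoothing, an assumption of this sketch).
    """
    committee = []
    for _ in range(n_members):
        model = {}
        for query, outcomes in data.items():
            resample = [rng.choice(outcomes) for _ in outcomes]
            model[query] = (sum(resample) + 1) / (len(resample) + 2)
        committee.append(model)
    return committee


def information_gain(committee, query):
    """Committee estimate of the information gain of a query.

    Members are weighted equally. The value is the mutual information
    between the committee member and the predicted outcome: it is zero
    when all members agree and grows as their predictions diverge.
    """
    preds = [model[query] for model in committee]
    mean_pred = sum(preds) / len(preds)
    mean_entropy = sum(entropy(p) for p in preds) / len(preds)
    return entropy(mean_pred) - mean_entropy


def best_query(committee, queries):
    """Return the query with the largest estimated information gain."""
    return max(queries, key=lambda q: information_gain(committee, q))


if __name__ == "__main__":
    # Hypothetical data: query "A" is well explored, query "B" is not.
    toy_data = {"A": [1] * 20, "B": [0, 1, 0, 1]}
    committee = bootstrap_committee(toy_data, n_members=5, rng=random.Random(0))
    for q in toy_data:
        print(q, information_gain(committee, q))
```

With only a handful of members, the average over the committee is a coarse estimate of the average over the full model space, which is the difficulty with small committees that the abstract points to.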


Taxonomy

Topics: Machine Learning and Algorithms · Data Stream Mining Techniques · Fault Detection and Control Systems