Active Learning with Importance Sampling
Muni Sreenivas Pydi, Vishnu Suresh Lokhande

TL;DR
This paper introduces ALIS, an active learning algorithm using importance sampling, providing bounds on true loss and an optimal sampling distribution to improve label efficiency.
Contribution
It proposes a novel importance sampling-based active learning method with theoretical loss bounds and an optimal sampling strategy.
Findings
Derived upper bounds on true loss for probabilistic sampling
Proposed an optimal sampling distribution to minimize loss bounds
Demonstrated theoretical advantages of importance sampling in active learning
Abstract
We consider an active learning setting where the algorithm has access to a large pool of unlabeled data and a small pool of labeled data. In each iteration, the algorithm chooses a few unlabeled data points and obtains their labels from an oracle. In this paper, we consider a probabilistic querying procedure to choose the points to be labeled. We propose an algorithm for Active Learning with Importance Sampling (ALIS), and derive upper bounds on the true loss incurred by the algorithm for an arbitrary probabilistic sampling procedure. Further, we propose an optimal sampling distribution that minimizes the upper bound on the true loss.
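The probabilistic querying idea in the abstract can be illustrated with a generic importance-weighted loss estimate. This is a minimal sketch and not the ALIS algorithm itself: the summary gives neither the paper's exact sampling scheme nor its optimal distribution, so the independent per-point querying and the names `query_pool` and `importance_weighted_loss` are assumptions made for illustration.

```python
import numpy as np

def query_pool(probs, rng):
    """Assumed scheme: query point i independently with probability probs[i]."""
    probs = np.asarray(probs, dtype=float)
    return rng.random(len(probs)) < probs

def importance_weighted_loss(losses, probs, queried):
    """Estimate the mean loss over the whole pool from queried points only.

    Each queried point is reweighted by 1 / probs[i], so the estimate is
    unbiased under independent querying as long as every probs[i] > 0.
    """
    losses = np.asarray(losses, dtype=float)
    probs = np.asarray(probs, dtype=float)
    queried = np.asarray(queried, dtype=bool)
    weighted = losses[queried] / probs[queried]
    return float(weighted.sum() / len(losses))

# Hypothetical per-point losses and query probabilities for a pool of 4 points.
losses = [1.0, 2.0, 3.0, 4.0]
probs = [0.5, 0.5, 1.0, 1.0]
rng = np.random.default_rng(0)
queried = query_pool(probs, rng)
estimate = importance_weighted_loss(losses, probs, queried)
```

Averaged over many random draws of `queried`, the estimate approaches the plain mean of `losses`. The choice of `probs` affects only the variance of the estimate, which is the kind of quantity an optimized sampling distribution would aim to control.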
Taxonomy
Topics: Machine Learning and Algorithms · Experimental Learning in Engineering · Gaussian Processes and Bayesian Inference
