Targeted active learning for probabilistic models
Christopher Tosh, Mauricio Tec, Wesley Tansey

TL;DR
PDBAL is a targeted active learning method that adaptively designs experiments to maximize scientific utility, efficiently identifying high-value outcomes with fewer experiments.
Contribution
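The paper introduces PDBAL, an active learning approach that takes a user-specified risk function into account when choosing experiments. It comes with proven bounds on label complexity and fast closed-form solutions for designing experiments with common exponential family likelihoods.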
Findings
In simulation studies, PDBAL consistently outperforms standard untargeted approaches that maximize expected information gain over the design space.
Theoretical bounds on label complexity are established.
PDBAL efficiently identifies effective drugs in cancer screening data.
Abstract
A fundamental task in science is to design experiments that yield valuable insights about the system under study. Mathematically, these insights can be represented as a utility or risk function that shapes the value of conducting each experiment. We present PDBAL, a targeted active learning method that adaptively designs experiments to maximize scientific utility. PDBAL takes a user-specified risk function and combines it with a probabilistic model of the experimental outcomes to choose designs that rapidly converge on a high-utility model. We prove theoretical bounds on the label complexity of PDBAL and provide fast closed-form solutions for designing experiments with common exponential family likelihoods. In simulation studies, PDBAL consistently outperforms standard untargeted approaches that focus on maximizing expected information gain over the design space. Finally, we demonstrate…
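To make the idea of combining a user-specified risk with a probabilistic model concrete, here is a minimal sketch of targeted design selection. It is an illustration under stated assumptions, not the paper's exact objective: the posterior is assumed to be a finite set of weighted candidate models, outcomes are assumed to be Bernoulli, and the names `expected_diameter`, `choose_design`, and `dist` are hypothetical. Each design is scored by the expected posterior-weighted distance between pairs of models after observing its outcome, and the design with the smallest score is chosen.

```python
import numpy as np

def expected_diameter(weights, lik, dist):
    """Expected pairwise model distance after observing one outcome.

    weights: (K,) current posterior weights over K candidate models.
    lik:     (K, Y) probability each model assigns to each outcome y.
    dist:    (K, K) user-defined distance between models (assumption:
             this encodes what the scientist cares about).
    """
    score = 0.0
    for y in range(lik.shape[1]):
        joint = weights * lik[:, y]   # unnormalized posterior given y
        p_y = joint.sum()             # predictive probability of y
        if p_y == 0.0:
            continue
        post = joint / p_y            # normalized posterior given y
        score += p_y * (post @ dist @ post)
    return score

def choose_design(weights, probs, dist):
    """Pick the design with the smallest expected diameter.

    probs: (K, X) Bernoulli success probability of model k at design x.
    """
    scores = []
    for x in range(probs.shape[1]):
        # Columns are outcomes y = 0 and y = 1.
        lik = np.stack([1.0 - probs[:, x], probs[:, x]], axis=1)
        scores.append(expected_diameter(weights, lik, dist))
    scores = np.array(scores)
    return int(np.argmin(scores)), scores

# Toy example: two candidate models, two designs.
# Design 0 cannot tell the models apart; design 1 can.
weights = np.array([0.5, 0.5])
probs = np.array([[0.5, 0.9],
                  [0.5, 0.1]])
# Distance 1 when two models disagree on the quantity of interest.
dist = np.array([[0.0, 1.0],
                 [1.0, 0.0]])

best, scores = choose_design(weights, probs, dist)
print(best, scores)
```

In this toy setup the informative design (index 1) receives the lower score. Changing `dist` changes which disagreements between models count, which is what makes the selection targeted rather than purely information-driven.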
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Computational Drug Discovery Methods
