Adaptive Sampling for Discovery
Ziping Xu, Eunjae Shim, Ambuj Tewari, Paul Zimmerman

TL;DR
This paper introduces the Adaptive Sampling for Discovery (ASD) problem, proposing an information-directed sampling algorithm with theoretical guarantees, demonstrated through simulations and real-world chemical discovery experiments.
Contribution
The paper rigorously formulates ASD and develops a general IDS algorithm with performance guarantees for linear, graph, and low-rank models.
Findings
IDS outperforms baseline methods in simulations
IDS shows benefits in real-data chemical discovery experiments
Theoretical guarantees cover IDS in linear, graph, and low-rank models
Abstract
In this paper, we study a sequential decision-making problem called Adaptive Sampling for Discovery (ASD). Starting with a large unlabeled dataset, algorithms for ASD adaptively label the points with the goal of maximizing the sum of responses. This problem has wide applications to real-world discovery problems, for example, drug discovery with the help of machine learning models. ASD algorithms face the well-known exploration-exploitation dilemma: the algorithm needs to choose points that yield information to improve model estimates, but it also needs to exploit the model. We rigorously formulate the problem and propose a general information-directed sampling (IDS) algorithm. We provide theoretical guarantees for the performance of IDS in linear, graph and low-rank models. The benefits of IDS are shown in both simulation experiments and real-data experiments for discovering chemical…
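To make the setup in the abstract concrete, the following is a minimal sketch of an ASD loop, assuming a Bayesian linear model with Gaussian noise. It is not the authors' IDS algorithm: the names `run_asd`, `greedy`, and `ratio_rule` are hypothetical, and `ratio_rule` is only a simplified information-ratio-style heuristic (squared optimistic regret estimate divided by a Gaussian information gain) chosen for illustration. Each point in the pool can be labeled at most once, and the objective is the sum of the observed responses.

```python
import numpy as np

def run_asd(X, y, budget, choose, noise_var=1.0, prior_var=1.0):
    """Label `budget` distinct pool points; return (sum of responses, picked indices)."""
    n, d = X.shape
    precision = np.eye(d) / prior_var      # posterior precision of the linear weights
    b = np.zeros(d)                        # accumulates x * y / noise_var
    unlabeled = np.ones(n, dtype=bool)     # points not yet labeled
    total, picks = 0.0, []
    for _ in range(budget):
        cov = np.linalg.inv(precision)     # posterior covariance
        mean = cov @ b                     # posterior mean
        idx = np.flatnonzero(unlabeled)
        pick = int(idx[choose(X[idx], mean, cov, noise_var)])
        unlabeled[pick] = False            # sampling without replacement
        picks.append(pick)
        total += float(y[pick])            # reveal the response and collect it
        precision += np.outer(X[pick], X[pick]) / noise_var
        b += X[pick] * y[pick] / noise_var
    return total, picks

def greedy(Xu, mean, cov, noise_var):
    """Pure exploitation: pick the point with the highest predicted response."""
    return int(np.argmax(Xu @ mean))

def ratio_rule(Xu, mean, cov, noise_var):
    """Illustrative heuristic (an assumption, not the paper's rule):
    minimize (optimistic regret estimate)^2 / information gain."""
    pred = Xu @ mean
    var = np.einsum("ij,jk,ik->i", Xu, cov, Xu)      # predictive variance per point
    gap = (pred + np.sqrt(var)).max() - pred         # optimistic regret estimate, >= 0
    info = 0.5 * np.log1p(var / noise_var) + 1e-12   # Gaussian information gain
    return int(np.argmin(gap ** 2 / info))

# Small synthetic demo: the data and true weights are made up for illustration.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_demo = X_demo @ rng.normal(size=5) + 0.1 * rng.normal(size=200)
for name, rule in (("greedy", greedy), ("ratio_rule", ratio_rule)):
    demo_total, _ = run_asd(X_demo, y_demo, budget=20, choose=rule, noise_var=0.01)
    print(name, round(demo_total, 3))
```

The two selection rules share the same loop, so the exploration-exploitation trade-off is isolated in the `choose` argument: `greedy` only exploits the current model, while `ratio_rule` also weighs how much a label would reduce uncertainty.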
Taxonomy
Topics: Machine Learning and Algorithms · Computational Drug Discovery Methods · Analytical Chemistry and Chromatography
