Quality-Weighted Vendi Scores And Their Application To Diverse Experimental Design
Quan Nguyen, Adji Bousso Dieng

TL;DR
This paper introduces quality-weighted Vendi scores to improve diversity in experimental design, enabling more effective exploration and discovery across scientific applications.
Contribution
It extends Vendi scores to incorporate quality, providing a novel metric that balances diversity and quality in experimental design tasks.
Findings
Increased the number of effective discoveries by 70% to 170% compared to baselines.
Enabled flexible policies for balancing quality and diversity.
Applied successfully to drug discovery, materials discovery, and reinforcement learning.
Abstract
Experimental design techniques such as active search and Bayesian optimization are widely used in the natural sciences for data collection and discovery. However, existing techniques tend to favor exploitation over exploration of the search space, which causes them to get stuck in local optima. This "collapse" problem prevents experimental design algorithms from yielding diverse high-quality data. In this paper, we extend the Vendi scores, a family of interpretable similarity-based diversity metrics, to account for quality. We then leverage these quality-weighted Vendi scores to tackle experimental design problems across various applications, including drug discovery, materials discovery, and reinforcement learning. We found that quality-weighted Vendi scores allow us to construct policies for experimental design that flexibly balance quality and diversity, and ultimately assemble…
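The abstract does not spell out the formula, so the following is only a minimal sketch of what a quality-weighted Vendi score might look like in practice. It assumes the order-1 Vendi score (the exponential of the Shannon entropy of the eigenvalues of the normalized similarity matrix K/n) and assumes that quality enters as a multiplicative factor, namely the average quality of the collected points. The function names `vendi_score` and `quality_weighted_vendi_score` are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def vendi_score(K):
    """Order-1 Vendi score of an n x n similarity matrix.

    Assumes K is symmetric positive semidefinite with ones on the diagonal,
    so that the eigenvalues of K / n are nonnegative and sum to 1.
    """
    K = np.asarray(K, dtype=float)
    n = K.shape[0]
    eigvals = np.linalg.eigvalsh(K / n)
    # Drop numerically zero eigenvalues; 0 * log(0) is treated as 0.
    eigvals = eigvals[eigvals > 1e-12]
    entropy = -np.sum(eigvals * np.log(eigvals))
    return float(np.exp(entropy))


def quality_weighted_vendi_score(K, quality):
    """Quality-weighted Vendi score (illustrative form only).

    ASSUMPTION: the score is the mean quality of the points multiplied by
    their Vendi score. This form is not stated in the summary above.
    """
    quality = np.asarray(quality, dtype=float)
    return float(quality.mean() * vendi_score(K))


if __name__ == "__main__":
    distinct = np.eye(4)          # four mutually dissimilar points
    identical = np.ones((4, 4))   # four copies of the same point
    quality = [0.5, 0.5, 0.5, 0.5]

    print(vendi_score(distinct))    # about 4.0: four effective points
    print(vendi_score(identical))   # about 1.0: one effective point
    print(quality_weighted_vendi_score(distinct, quality))  # about 2.0
```

Under these assumptions, a batch of identical high-quality points scores no better than a single such point, while a batch of distinct points is rewarded in proportion to both its effective size and its average quality. That is the kind of trade-off a selection policy could optimize to avoid the collapse problem described in the abstract.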
Taxonomy
Topics: Manufacturing Process and Optimization · Optimal Experimental Design Methods
