Efficient Budget Allocation for Large-Scale LLM-Enabled Virtual Screening

Zaile Li; Weiwei Fan; L. Jeff Hong

arXiv:2408.09537 · stat.ML · April 28, 2025

TL;DR

This paper introduces an efficient, sample-optimal algorithm for large-scale virtual screening using LLMs as evaluators, significantly reducing costs and improving ranking accuracy in decision-making tasks.

Contribution

It proposes the EFG-m algorithm for scalable, cost-effective virtual screening with LLMs, and proves its optimality and consistency in large-scale settings.

Findings

1. The EFG-m algorithm is sample-optimal and consistent.

2. The approach induces an indifference-based ranking within selected subsets.

3. Numerical experiments validate the effectiveness of the algorithms.

Abstract

Screening tasks that aim to identify a small subset of top alternatives from a large pool are common in business decision-making processes. These tasks often require substantial human effort to evaluate each alternative's performance, making them time-consuming and costly. Motivated by recent advances in large language models (LLMs), particularly their ability to generate outputs that align well with human evaluations, we consider an LLM-as-human-evaluator approach for conducting screening virtually, thereby reducing the cost burden. To achieve scalability and cost-effectiveness in virtual screening, we identify that the stochastic nature of LLM outputs and their cost structure necessitate efficient budget allocation across all alternatives. To address this, we propose using a top-$m$ greedy evaluation mechanism, a simple yet effective approach that keeps evaluating the current top-$m$ …
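The abstract is cut off before it gives the full procedure, so the following is only a minimal Python sketch of what a top-$m$ greedy evaluation loop might look like, not the authors' EFG-m algorithm. Everything in it is an assumption made for illustration: the function name `top_m_greedy_sketch`, the hypothetical noisy evaluator `evaluate(i)` (standing in for one LLM call on alternative `i`), the warm-up of `n0` evaluations per alternative, and the choice to rank alternatives by sample mean and evaluate each member of the current top-$m$ once per round.

```python
import heapq
import random


def top_m_greedy_sketch(evaluate, k, m, budget, n0=1):
    """Illustrative top-m greedy budget allocation (assumed details, see lead-in).

    evaluate(i) -> float is a hypothetical noisy evaluator for alternative i.
    Returns (sorted indices of the final top-m by sample mean, per-alternative counts).
    """
    if not (0 < m <= k) or n0 < 1 or budget < n0 * k:
        raise ValueError("need 0 < m <= k, n0 >= 1, and budget >= n0 * k")

    sums = [0.0] * k   # running sum of observed scores per alternative
    counts = [0] * k   # number of evaluations spent on each alternative

    def sample(i):
        sums[i] += evaluate(i)
        counts[i] += 1

    # Assumed warm-up: evaluate every alternative n0 times so each has a sample mean.
    for i in range(k):
        for _ in range(n0):
            sample(i)
    used = n0 * k

    def current_top_m():
        # The m alternatives with the largest sample means so far.
        return heapq.nlargest(m, range(k), key=lambda i: sums[i] / counts[i])

    # Greedy phase: spend the remaining budget only on the current top-m.
    while used < budget:
        for i in current_top_m():
            if used >= budget:
                break
            sample(i)
            used += 1

    return sorted(current_top_m()), counts


# Usage with a made-up evaluator: true quality i / 100 plus Gaussian noise.
rng = random.Random(0)
selected, spent = top_m_greedy_sketch(
    lambda i: i / 100 + rng.gauss(0.0, 0.05), k=50, m=5, budget=500, n0=2
)
print(selected, sum(spent))
```

The point of the sketch is the allocation pattern: alternatives that fall out of the current top-$m$ stop consuming budget, while an alternative whose sample mean was inflated by noise keeps being re-evaluated until its mean drops and it is displaced.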

Taxonomy

Topics: Fault Detection and Control Systems

Methods: ALIGN · Sparse Evolutionary Training · Focus