When to Screen, When to Bypass: LLM-Judges in Resource-Scarce AI-Human Workflow
Ruihan Lin, Jiheng Zhang

TL;DR
This paper models the decision of when to use LLM-based judges in AI-human workflows to optimize throughput and resource utilization, revealing conditions where screening is beneficial or detrimental.
Contribution
It introduces a queueing model for LLM-judged workflows, deriving optimal routing and capacity policies, and demonstrates asymptotic optimality and practical improvements over simple heuristics.
Findings
Optimal judge allocation depends on which resource is the bottleneck.
Screening amplifies human capacity when reviewers are scarce.
The proposed policy outperforms the always-screen and never-screen benchmarks.
Abstract
AI systems can generate outputs at scale, but most outputs require human approval before release. This creates a bottleneck: humans cannot keep pace with AI-generated volume. A natural response is to insert an LLM-judge that screens outputs before they reach humans, filtering errors and amplifying effective review capacity. But judges are imperfect. False rejections send correct outputs back for unnecessary rework; false acceptances consume judge capacity without relieving humans. When should outputs be routed through the judge, and when should they bypass it directly to human review? We model this workflow as a queueing network with three resource pools and use a fluid approximation to characterize optimal judge allocation. The analysis reveals that optimal allocation depends critically on which resource is the current bottleneck: screening amplifies human capacity when reviewers are…
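The trade-off described in the abstract can be illustrated with a small steady-state calculation. The sketch below is not the paper's model: it is a toy fluid approximation built on assumptions introduced here for illustration. A fraction `x` of generated outputs is screened, human reviewers are treated as perfect, and rework of rejected outputs is counted as part of the generation rate. All names and parameter values (`gen_cap`, `judge_cap`, `human_cap`, `p_correct`, `judge_tpr`, `judge_fpr`) are hypothetical.

```python
def throughput(x, gen_cap, judge_cap, human_cap, p_correct, judge_tpr, judge_fpr):
    """Rate of correct outputs released when a fraction x is screened (toy model)."""
    # Share of screened outputs the judge accepts, correctly or not.
    pass_rate = p_correct * judge_tpr + (1 - p_correct) * judge_fpr
    # Human reviews needed per generated output: bypassed plus judge-accepted.
    human_load = (1 - x) + x * pass_rate
    # The sustainable generation rate is capped by every resource pool.
    limits = [gen_cap]
    if human_load > 0:
        limits.append(human_cap / human_load)
    if x > 0:
        limits.append(judge_cap / x)
    rate = min(limits)
    # Only correct outputs that reach a human are released; a false
    # rejection by the judge removes a correct output from this flow.
    return rate * p_correct * ((1 - x) + x * judge_tpr)


def best_screening_fraction(steps=100, **params):
    """Grid search over x in [0, 1]; an assumption-level stand-in for an optimal policy."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda x: throughput(x, **params))


# Hypothetical parameter sets: identical except for reviewer capacity.
scarce_humans = dict(gen_cap=100, judge_cap=100, human_cap=20,
                     p_correct=0.5, judge_tpr=0.9, judge_fpr=0.1)
ample_humans = dict(scarce_humans, human_cap=200)

for name, params in [("scarce", scarce_humans), ("ample", ample_humans)]:
    print(name,
          throughput(0.0, **params),   # never screen
          throughput(1.0, **params),   # always screen
          best_screening_fraction(**params))
```

With these made-up numbers, always screening nearly doubles throughput when reviewers are scarce (about 18 versus 10 released outputs per unit time), because the judge removes most incorrect outputs before they reach the human queue. When reviewers are ample, screening lowers throughput (about 45 versus 50), because false rejections discard correct outputs that a human would have approved.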
Taxonomy
Topics: Ethics and Social Impacts of AI · Personal Information Management and User Behavior · Human-Automation Interaction and Safety
