Cascaded Language Models for Cost-effective Human-AI Decision-Making
Claudio Fanconi, Mihaela van der Schaar

TL;DR
This paper introduces a cascaded large language model framework that adaptively balances prediction accuracy, cost, and abstention, using online learning and human feedback to improve decision-making in question-answering tasks.
Contribution
It proposes a novel cascaded decision framework with adaptive deferral and abstention policies, incorporating online learning to optimize human-AI collaboration.
Findings
The cascade outperforms single-model baselines in accuracy and cost-efficiency.
It is effective on both general and medical question-answering tasks.
It reduces reliance on costly human intervention while maintaining high confidence.
Abstract
A challenge in human-AI decision-making is to balance three factors: the correctness of predictions, the cost of knowledge and reasoning complexity, and the confidence about whether to abstain from automated answers or escalate to human experts. In this work, we present a cascaded LLM decision framework that adaptively delegates tasks across multiple tiers of expertise -- a base model for initial candidate answers, a more capable and knowledgeable (but costlier) large model, and a human expert for when the model cascade abstains. Our method proceeds in two stages. First, a deferral policy determines whether to accept the base model's answer or regenerate it with the large model based on the confidence score. Second, an abstention policy decides whether the cascade model response is sufficiently certain or requires human intervention. Moreover, to overcome static policies and accommodate…
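The two-stage procedure described in the abstract can be illustrated with a minimal sketch. It assumes each model returns an answer together with a confidence score in [0, 1], and it stands in fixed thresholds for the deferral and abstention policies. The names `cascade`, `CascadeResult`, `defer_threshold`, and `abstain_threshold` are hypothetical and do not come from the paper, and the online-learning update that the abstract mentions is not modelled here.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Assumption: a model is any callable mapping a question to
# (answer, confidence), with confidence in [0, 1].
Model = Callable[[str], Tuple[str, float]]


@dataclass
class CascadeResult:
    answer: Optional[str]  # None when the cascade abstains
    source: str            # "base", "large", or "human"
    confidence: float


def cascade(
    question: str,
    base_model: Model,
    large_model: Model,
    defer_threshold: float = 0.7,    # hypothetical value
    abstain_threshold: float = 0.5,  # hypothetical value
) -> CascadeResult:
    # Stage 1 (deferral): keep the base answer if it is confident enough,
    # otherwise regenerate the answer with the large model.
    answer, conf = base_model(question)
    source = "base"
    if conf < defer_threshold:
        answer, conf = large_model(question)
        source = "large"

    # Stage 2 (abstention): if the cascade's answer is still uncertain,
    # abstain and hand the question to a human expert.
    if conf < abstain_threshold:
        return CascadeResult(answer=None, source="human", confidence=conf)
    return CascadeResult(answer=answer, source=source, confidence=conf)


if __name__ == "__main__":
    # Stub models, used only to exercise the control flow.
    unsure_base = lambda q: ("A", 0.4)
    confident_large = lambda q: ("B", 0.95)
    print(cascade("Which option is correct?", unsure_base, confident_large))
```

In this sketch the thresholds are static, which is the limitation the abstract says the method is designed to overcome; a closer approximation would replace them with policies that are updated from feedback.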
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI · Human-Automation Interaction and Safety
