Learning When Not to Decide: A Framework for Overcoming Factual Presumptuousness in AI Adjudication
Mohamed Afane, Emily Robitschek, Derek Ouyang, Daniel E. Ho

TL;DR
This paper addresses AI presumptuousness in legal decision-making by developing a framework that improves accuracy and appropriately defers decisions when evidence is incomplete, enhancing reliability in high-stakes settings.
Contribution
The paper introduces SPEC, a structured prompting framework that significantly reduces presumptuousness and improves decision accuracy in legal AI adjudication.
Findings
Standard AI approaches achieve only 15% accuracy on cases where the available information is insufficient.
Advanced prompting improves accuracy but often over-corrects, leading to unnecessary deferrals.
SPEC achieves 89% overall accuracy and appropriately defers when evidence is lacking.
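The findings above separate three quantities that are easy to conflate: overall accuracy, accuracy on cases where the right answer is to defer, and the rate of unnecessary deferrals on cases that could have been decided. The following is a minimal sketch of how such metrics might be computed. It is not the paper's evaluation code; the `Case` type, the label names, and the toy data are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical label meaning "evidence is insufficient; request more fact-finding".
DEFER = "defer"


@dataclass
class Case:
    gold: str       # assumed correct outcome: "approve", "deny", or DEFER
    predicted: str  # model output drawn from the same label set


def score(cases):
    """Compute overall accuracy, accuracy on insufficient-information cases,
    and the rate of unnecessary deferrals on decidable cases."""
    if not cases:
        raise ValueError("need at least one case")
    insufficient = [c for c in cases if c.gold == DEFER]
    decidable = [c for c in cases if c.gold != DEFER]
    correct = sum(c.predicted == c.gold for c in cases)
    return {
        "overall_accuracy": correct / len(cases),
        # Low values here indicate presumptuousness: deciding without enough evidence.
        "insufficient_accuracy": (
            sum(c.predicted == DEFER for c in insufficient) / len(insufficient)
            if insufficient else None
        ),
        # High values here indicate over-correction: deferring when a decision was possible.
        "over_deferral_rate": (
            sum(c.predicted == DEFER for c in decidable) / len(decidable)
            if decidable else None
        ),
    }


# Made-up toy data, not results from the paper.
toy = [
    Case("approve", "approve"),
    Case("deny", DEFER),     # unnecessary deferral
    Case(DEFER, "approve"),  # presumptuous decision
    Case(DEFER, DEFER),
]
print(score(toy))
```

Reporting the two conditional rates alongside overall accuracy makes both failure modes visible: a system that always decides scores zero on insufficient-information cases, while one that always defers has an over-deferral rate of one.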
Abstract
A well-known limitation of AI systems is presumptuousness: the tendency of AI systems to provide confident answers when information may be lacking. This challenge is particularly acute in legal applications, where a core task for attorneys, judges, and administrators is to determine whether evidence is sufficient to reach a conclusion. We study this problem in the important setting of unemployment insurance adjudication, which has seen rapid integration of AI systems and where the question of additional fact-finding poses the most significant bottleneck for a system that affects millions of applicants annually. First, through a collaboration with the Colorado Department of Labor and Employment, we secure rare access to official training materials and guidance to design a novel benchmark that systematically varies in information completeness. Second, we evaluate four leading AI platforms…
