Does AI help humans make better decisions? A statistical evaluation framework for experimental and observational studies
Eli Ben-Michael, D. James Greiner, Melody Huang, Kosuke Imai, Zhichao Jiang, Sooahn Shin

TL;DR
This paper introduces a statistical framework that evaluates, under minimal assumptions, whether AI improves human decision-making, and applies it to a real trial involving risk assessment tools and language models.
Contribution
The paper develops a methodological framework for empirically comparing human-alone, AI-alone, and human-with-AI decision-making systems when the provision of AI recommendations is randomized across cases.
Findings
AI recommendations did not improve classification accuracy in bail decisions.
Replacing judges with algorithms worsened decision performance.
The framework allows comparison of different decision-making systems with minimal assumptions.
Abstract
The use of Artificial Intelligence (AI), or more generally data-driven algorithms, has become ubiquitous in today's society. Yet, in many cases and especially when stakes are high, humans still make final decisions. The critical question, therefore, is whether AI helps humans make better decisions compared to a human-alone or AI-alone system. We introduce a new methodological framework to empirically answer this question with a minimal set of assumptions. We measure a decision maker's ability to make correct decisions using standard classification metrics based on the baseline potential outcome. We consider a single-blinded and unconfounded treatment assignment, where the provision of AI-generated recommendations is assumed to be randomized across cases with humans making final decisions. Under this study design, we show how to compare the performance of three alternative…
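The following is a minimal Python sketch, not the authors' estimator, of why randomizing the AI recommendation can be enough to compare a human-alone system with a human-with-AI system even though the baseline outcome is observed only for some cases. The names and conventions are illustrative assumptions rather than the paper's: a binary decision d, where d = 0 is the decision under which the baseline outcome y is observed; a binary outcome y, where the decision d = 1 counts as correct when the baseline outcome would be 1; and a randomized indicator z for whether the AI recommendation was provided. If z is randomized and affects the outcome only through the decision, the share of cases with baseline outcome 0 is the same in both arms. The unobservable false-positive share therefore differs between arms by exactly the negative of the difference in observable true-negative shares, so the difference in misclassification risk depends only on cases with d = 0.

```python
def risk_difference(records, fp_cost=1.0):
    """Estimate risk(human + AI) - risk(human alone) from randomized data.

    records: iterable of (z, d, y) tuples (hypothetical layout).
      z = 1 if the AI recommendation was shown, else 0 (randomized).
      d = binary decision; the baseline outcome is observed only when d == 0.
      y = observed outcome; ignored when d == 1 (may be None).
    fp_cost: assumed cost of a false positive relative to a false negative.
    """
    n = {0: 0, 1: 0}
    false_neg = {0: 0, 1: 0}  # d == 0 and y == 1
    true_neg = {0: 0, 1: 0}   # d == 0 and y == 0
    for z, d, y in records:
        n[z] += 1
        if d == 0:
            if y == 1:
                false_neg[z] += 1
            else:
                true_neg[z] += 1

    p_fn = {z: false_neg[z] / n[z] for z in (0, 1)}
    p_tn = {z: true_neg[z] / n[z] for z in (0, 1)}

    # Change in false negatives is observed directly. Change in false
    # positives equals minus the change in true negatives, because the
    # share of cases with baseline outcome 0 is the same in both arms.
    return (p_fn[1] - p_fn[0]) + fp_cost * (p_tn[0] - p_tn[1])


# Toy data, invented purely for illustration.
toy = [
    (0, 0, 1), (0, 0, 0), (0, 1, None), (0, 1, None),
    (1, 0, 0), (1, 0, 0), (1, 0, 1), (1, 1, None),
]
diff = risk_difference(toy)
```

A negative value indicates lower estimated misclassification risk with AI recommendations than without. The sketch gives a point estimate only; it includes no uncertainty quantification and does not cover the AI-alone comparison.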
Taxonomy
Topics: Ethics and Social Impacts of AI
Methods: Sparse Evolutionary Training
