Does AI help humans make better decisions? A statistical evaluation framework for experimental and observational studies

Eli Ben-Michael; D. James Greiner; Melody Huang; Kosuke Imai; Zhichao Jiang; Sooahn Shin

arXiv:2403.12108·cs.AI·October 15, 2024·2 cites

Open Access

TL;DR

This paper introduces a statistical framework for evaluating whether AI recommendations improve human decision-making under minimal assumptions, and applies it to a real randomized trial of a risk assessment tool, with an additional comparison against language models.

Contribution

The paper develops a methodological framework for empirically comparing human-alone, AI-alone, and human-with-AI decision-making systems when the provision of AI recommendations is randomized across cases.

Findings

01 AI recommendations did not improve the classification accuracy of judges' bail decisions.

02 Replacing judges with algorithms (the risk assessment tool or a language model) led to worse classification performance.

03 The framework allows different decision-making systems to be compared under minimal assumptions.

Abstract

The use of Artificial Intelligence (AI), or more generally data-driven algorithms, has become ubiquitous in today's society. Yet, in many cases and especially when stakes are high, humans still make final decisions. The critical question, therefore, is whether AI helps humans make better decisions compared to a human-alone or AI-alone system. We introduce a new methodological framework to empirically answer this question with a minimal set of assumptions. We measure a decision maker's ability to make correct decisions using standard classification metrics based on the baseline potential outcome. We consider a single-blinded and unconfounded treatment assignment, where the provision of AI-generated recommendations is assumed to be randomized across cases with humans making final decisions. Under this study design, we show how to compare the performance of three alternative…
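A minimal sketch of why randomization helps here, under assumptions that are not stated in the text above. Suppose the decision `d` is binary, the outcome `y` is only informative when `d == 0` (where it equals the baseline potential outcome), and errors are scored with an equal-weight 0-1 loss. False positives (`d == 1` when the baseline outcome would have been 0) are never observed directly, but because randomization makes the distribution of the baseline outcome the same in both arms, the unobserved part cancels when the two arms are differenced. The function name, the `(z, d, y)` record layout, and the toy data are all hypothetical, and this is an illustration of the idea rather than the authors' estimator; it reports a point estimate only, with no uncertainty quantification.

```python
def misclassification_difference(records):
    """Estimate (human-with-AI error rate) minus (human-alone error rate).

    records: list of (z, d, y) tuples (hypothetical layout), where
      z = 1 if the AI recommendation was provided (randomized), else 0
      d = 1 if the human chose the intervening decision, else 0
      y = observed binary outcome; only used when d == 0
    Assumes equal-weight 0-1 loss and randomized z.
    """
    def arm_rates(z):
        arm = [(d, y) for (zz, d, y) in records if zz == z]
        if not arm:
            raise ValueError(f"no cases in arm z={z}")
        n = len(arm)
        # Observable joint frequencies among cases with d == 0.
        false_neg = sum(1 for d, y in arm if d == 0 and y == 1) / n
        true_neg = sum(1 for d, y in arm if d == 0 and y == 0) / n
        return false_neg, true_neg

    fn_ai, tn_ai = arm_rates(1)
    fn_human, tn_human = arm_rates(0)
    # error = FN + FP and FP = P(baseline outcome = 0) - TN; the
    # P(baseline outcome = 0) term is identical across randomized arms
    # and cancels, leaving only observable quantities.
    return (fn_ai - fn_human) - (tn_ai - tn_human)


# Toy data, invented purely for illustration.
toy = [
    (0, 0, 1), (0, 0, 0), (0, 1, 0), (0, 1, 0),  # human-alone arm
    (1, 0, 0), (1, 0, 0), (1, 1, 0), (1, 0, 1),  # human-with-AI arm
]
print(misclassification_difference(toy))
```

A negative value would indicate fewer misclassifications when the recommendation is provided; a value near zero would be consistent with no improvement.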

Taxonomy

Topics: Ethics and Social Impacts of AI

Methods: Sparse Evolutionary Training