Human-AI collectives produce the most accurate differential diagnoses

N. Zöller; J. Berger; I. Lin; N. Fu; J. Komarneni; G. Barabucci; K. Laskowski; V. Shia; B. Harack; E. A. Chu; V. Trianni; R. H.J.M. Kurvers; S. M. Herzog

arXiv:2406.14981·cs.AI·June 24, 2024

TL;DR

This paper shows that hybrid collectives combining physicians with large language models diagnose complex medical cases significantly more accurately than individual physicians, individual LLMs, physician-only collectives, or LLM-only collectives.

Contribution

The study introduces a hybrid human-AI collective approach that leverages the complementary strengths of physicians and LLMs to enhance diagnostic accuracy in medicine.
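To make the idea of pooling concrete, here is a minimal sketch of one way ranked differential diagnoses from physicians and LLMs could be merged into a single list. It assumes reciprocal-rank scoring (a diagnosis at rank r contributes weight / r), which is a common rank-aggregation rule chosen for illustration and is not stated in this summary as the authors' method. The function name `pool_differentials`, the `weights` parameter, and the example diagnoses are all hypothetical.

```python
from collections import defaultdict


def pool_differentials(ranked_lists, weights=None):
    """Merge ranked differential diagnoses into one ranked list.

    ranked_lists: list of lists of diagnosis strings, most likely first.
    weights: optional per-list weights (hypothetical; defaults to 1.0 each).
    Scoring rule (an assumption for illustration): weight / rank.
    """
    if weights is None:
        weights = [1.0] * len(ranked_lists)
    if len(weights) != len(ranked_lists):
        raise ValueError("weights must have one entry per ranked list")

    scores = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        seen = set()
        for rank, diagnosis in enumerate(ranking, start=1):
            # Crude normalization; real data would need mapping to a shared vocabulary.
            key = diagnosis.strip().lower()
            if key in seen:
                continue  # count each diagnosis once per contributor
            seen.add(key)
            scores[key] += weight / rank

    # Highest score first; ties broken alphabetically for a stable result.
    return sorted(scores, key=lambda d: (-scores[d], d))


physician_a = ["pulmonary embolism", "pneumonia", "pericarditis"]
physician_b = ["pneumonia", "pulmonary embolism"]
llm = ["pulmonary embolism", "aortic dissection", "pneumonia"]

print(pool_differentials([physician_a, physician_b, llm]))
# ['pulmonary embolism', 'pneumonia', 'aortic dissection', 'pericarditis']
```

In this sketch a diagnosis proposed by only one contributor (here "aortic dissection") still enters the pooled list, which is one way different contributors' errors and omissions can offset each other.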

Findings

01

Hybrid collectives outperform individual physicians and LLMs in accuracy.

02

The approach is effective across multiple medical specialties.

03

Combining human and AI diagnoses reduces different types of errors.

Abstract

Artificial intelligence systems, particularly large language models (LLMs), are increasingly being employed in high-stakes decisions that impact both individuals and society at large, often without adequate safeguards to ensure safety, quality, and equity. Yet LLMs hallucinate, lack common sense, and are biased - shortcomings that may reflect LLMs' inherent limitations and thus may not be remedied by more sophisticated architectures, more data, or more human feedback. Relying solely on LLMs for complex, high-stakes decisions is therefore problematic. Here we present a hybrid collective intelligence system that mitigates these risks by leveraging the complementary strengths of human experience and the vast information processed by LLMs. We apply our method to open-ended medical diagnostics, combining 40,762 differential diagnoses made by physicians with the diagnoses of five state-of-the…

Code & Models

Repositories

nikozoe/human_ai_collectives
Official
