Large Language Models with Human-In-The-Loop Validation for Systematic Review Data Extraction

Noah L. Schroeder; Chris Davis Jaldi; Shan Zhang

arXiv:2501.11840 · cs.HC · January 22, 2025 · 3 cites

Open Access

TL;DR

This study evaluates the use of large language models for automating data extraction in systematic reviews, demonstrating promising accuracy but emphasizing the importance of human oversight, and introduces an open-source tool for this purpose.

Contribution

It introduces a human-in-the-loop framework and an open-source tool (AIDE) to improve LLM-based data extraction for systematic reviews.

Findings

01

Across the three LLMs tested, 62.43% to 72.14% of extracted data was consistent with human coding

02

Gemini 1.5 Pro and Gemini 1.5 Flash were more consistent with human coding than Mistral Large 2

03

Human-in-the-loop process is essential for reliable data extraction

Abstract

Systematic reviews are time-consuming endeavors. Historically speaking, knowledgeable humans have had to screen and extract data from studies before it can be analyzed. However, large language models (LLMs) hold promise to greatly accelerate this process. After a pilot study which showed great promise, we investigated the use of freely available LLMs for extracting data for systematic reviews. Using three different LLMs, we extracted 24 types of data, 9 explicitly stated variables and 15 derived categorical variables, from 112 studies that were included in a published scoping review. Overall we found that Gemini 1.5 Flash, Gemini 1.5 Pro, and Mistral Large 2 performed reasonably well, with 71.17%, 72.14%, and 62.43% of data extracted being consistent with human coding, respectively. While promising, these results highlight the dire need for a human-in-the-loop (HIL) process for…
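
The consistency figures above are percentages of LLM-extracted values that agree with human coding. The following is a minimal sketch of how such a comparison might be computed and how disagreements could be queued for human review. It is not the AIDE tool's implementation. The function names, the data layout, and the matching rule (exact match after trimming whitespace and ignoring case) are all assumptions made for illustration, and the toy records are invented, not taken from the paper.

```python
from collections import defaultdict


def _normalize(value):
    """Assumed matching rule: compare as trimmed, case-insensitive strings."""
    return str(value).strip().casefold()


def consistency_report(human, llm):
    """Compare LLM-extracted values against human-coded values.

    Both arguments map (study_id, variable) -> value.
    Returns (overall_percent, per_variable_percent, flagged), where
    flagged lists the cells a human reviewer should check.
    """
    matches = defaultdict(int)
    totals = defaultdict(int)
    flagged = []

    for key, human_value in human.items():
        _, variable = key
        totals[variable] += 1
        llm_value = llm.get(key)
        if llm_value is not None and _normalize(llm_value) == _normalize(human_value):
            matches[variable] += 1
        else:
            # Missing or mismatched values go to the human-in-the-loop queue.
            flagged.append(key)

    total_cells = sum(totals.values())
    overall = 100.0 * sum(matches.values()) / total_cells if total_cells else 0.0
    per_variable = {v: 100.0 * matches[v] / n for v, n in totals.items()}
    return overall, per_variable, flagged


# Toy data, invented purely for illustration.
human_coded = {
    ("s1", "year"): "2020",
    ("s1", "design"): "RCT",
    ("s2", "year"): "2019",
    ("s2", "design"): "Survey",
}
llm_coded = {
    ("s1", "year"): "2020 ",
    ("s1", "design"): "rct",
    ("s2", "year"): "2018",
    ("s2", "design"): "survey",
}

overall, per_variable, flagged = consistency_report(human_coded, llm_coded)
print(overall, per_variable, flagged)
```

Exact matching is the simplest choice and probably suits explicitly stated variables better than derived categorical ones, where a reviewer may need to judge whether two labels mean the same thing.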

Taxonomy

Topics: Topic Modeling