Scaling Systematic Literature Reviews with Machine Learning Pipelines

Seraphina Goldfarb-Tarrant; Alexander Robertson; Jasmina Lazic; Theodora Tsouloufi; Louise Donnison; Karen Smyth

arXiv:2010.04665·cs.CL·October 12, 2020

TL;DR

This paper presents a machine learning pipeline that automates key stages of systematic literature reviews, achieving high accuracy and generalizability with minimal expert annotation, significantly reducing review time.

Contribution

The authors develop and evaluate a comprehensive ML pipeline for automating systematic reviews, demonstrating effective automation of document retrieval, selection, and data extraction.

Findings

01. High accuracy achieved with only 2 weeks of annotation

02. Pipeline generalizes well to unseen data from different countries

03. Automation reduces review time by approximately 85%

Abstract

Systematic reviews, which entail the extraction of data from large numbers of scientific documents, are an ideal avenue for the application of machine learning. They are vital to many fields of science and philanthropy, but are very time-consuming and require experts. Yet the three main stages of a systematic review are easily done automatically: searching for documents can be done via APIs and scrapers, selection of relevant documents can be done via binary classification, and extraction of data can be done via sequence-labelling classification. Despite the promise of automation for this field, little research exists that examines the various ways to automate each of these tasks. We construct a pipeline that automates each of these aspects, and experiment with many human-time vs. system quality trade-offs. We test the ability of classifiers to work well on small amounts of data and to…
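The three stages named in the abstract (search, selection via binary classification, extraction via sequence labelling) can be sketched as a toy pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation: the search stage is an in-memory substring filter standing in for APIs and scrapers, the selector is a hand-written multinomial naive Bayes classifier, and the extractor is a phrase-matching BIO tagger standing in for a trained sequence labeller. All function names, the `ENT` label, and the example texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased words, plus numbers with an optional decimal part and % sign.
    return re.findall(r"[a-z]+|\d+(?:\.\d+)?%?", text.lower())


def search(corpus, query):
    # Stage 1 stand-in: a real system would call APIs or scrapers here.
    q = query.lower()
    return [doc for doc in corpus if q in doc.lower()]


class NaiveBayesSelector:
    """Stage 2 stand-in: binary relevance classifier with add-one smoothing."""

    def fit(self, texts, labels):
        self.counts = {0: Counter(), 1: Counter()}
        self.doc_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.counts[label].update(tokenize(text))
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        return self

    def _log_score(self, tokens, label):
        total = sum(self.counts[label].values())
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for tok in tokens:
            score += math.log((self.counts[label][tok] + 1) / (total + len(self.vocab)))
        return score

    def predict(self, text):
        tokens = tokenize(text)
        return int(self._log_score(tokens, 1) > self._log_score(tokens, 0))


def bio_tag(tokens, phrases):
    # Stage 3 stand-in: emit BIO tags by matching known token phrases.
    tags = ["O"] * len(tokens)
    for phrase in phrases:
        n = len(phrase)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == phrase:
                tags[i] = "B-ENT"
                tags[i + 1:i + n] = ["I-ENT"] * (n - 1)
    return tags


def run_pipeline(corpus, query, selector, phrases):
    results = []
    for doc in search(corpus, query):
        if selector.predict(doc) == 1:  # keep only documents judged relevant
            tokens = tokenize(doc)
            results.append(list(zip(tokens, bio_tag(tokens, phrases))))
    return results


# Invented example data, for illustration only.
train_texts = [
    "prevalence of bovine tuberculosis in dairy cattle herds",
    "survey of disease prevalence in sheep and goats",
    "a new cattle feed pricing model for markets",
    "tourism and hotel pricing trends",
]
train_labels = [1, 1, 0, 0]
corpus = [
    "Bovine tuberculosis prevalence in cattle was 12.5%",
    "Hotel pricing and cattle markets",
    "Weather report for the weekend",
]

selector = NaiveBayesSelector().fit(train_texts, train_labels)
results = run_pipeline(corpus, "cattle", selector, [["bovine", "tuberculosis"]])
for token, tag in results[0]:
    print(token, tag)
```

Each stage sits behind a plain function or a small class, so a stand-in could be swapped for a stronger component (for example a neural tagger in place of `bio_tag`) without changing `run_pipeline`.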

Code & Models

Repositories

seraphinatarrant/systematic_reviews (Official)