Using Chao's Estimator as a Stopping Criterion for Technology-Assisted Review
Michiel P. Bron, Peter G. M. van der Heijden, Ad J. Feelders, Arno P. J. M. Siebes

TL;DR
This paper introduces a new stopping criterion for Technology-Assisted Review based on Chao's Population Size Estimator. The criterion estimates the prevalence of relevant documents in the dataset so that reviewers can stop with few relevant documents missed and few irrelevant documents read.
Contribution
The paper proposes a novel ensemble-based active learning strategy and a stopping criterion based on Chao's Estimator, improving review efficiency and accuracy.
Findings
The Chao's Estimator-based criterion performs well across multiple datasets.
It effectively balances the number of relevant documents missed and irrelevant documents read.
Compared to existing methods, it offers improved stopping decisions.
Abstract
Technology-Assisted Review (TAR) aims to reduce the human effort required for screening processes such as abstract screening for systematic literature reviews. Human reviewers label documents as relevant or irrelevant during this process, while the system incrementally updates a prediction model based on the reviewers' previous decisions. After each model update, the system proposes new documents it deems relevant, to prioritize relevant documents over irrelevant ones. A stopping criterion is necessary to guide users in stopping the review process to minimize the number of missed relevant documents and the number of read irrelevant documents. In this paper, we propose and evaluate a new ensemble-based Active Learning strategy and a stopping criterion based on Chao's Population Size Estimator that estimates the prevalence of relevant documents in the dataset. Our simulation study…
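To make the idea of a population-size stopping criterion concrete, here is a minimal sketch in Python. It uses the classic Chao lower-bound formula, N̂ = n + f1² / (2·f2), where n is the number of distinct relevant documents found, f1 is how many of them were found by exactly one source, and f2 by exactly two. The abstract does not state which variant of Chao's estimator the paper uses, nor how capture counts are derived from the ensemble, so the function names, the "capture count per document" input, and the `target_recall` threshold are all assumptions made for illustration only.

```python
from collections import Counter


def chao_estimate(capture_counts):
    """Classic Chao lower-bound estimate of total population size.

    capture_counts: one integer per *observed* relevant document, giving
    how many times it was captured (assumed here: by how many ensemble
    members it was found). This input format is an assumption.
    """
    freq = Counter(capture_counts)
    n = len(capture_counts)        # distinct relevant documents observed
    f1, f2 = freq[1], freq[2]      # captured exactly once / exactly twice
    if f2 > 0:
        return n + (f1 * f1) / (2 * f2)
    # Bias-corrected form, used when no document was captured exactly twice.
    return n + f1 * (f1 - 1) / 2


def should_stop(found, estimate, target_recall=0.95):
    """Hypothetical stopping rule: stop once the documents found so far
    cover the target fraction of the estimated relevant population."""
    return found >= target_recall * estimate


# Usage: 7 relevant documents found so far, with these capture counts.
counts = [1, 1, 1, 1, 2, 2, 3]
estimate = chao_estimate(counts)
print(estimate, should_stop(len(counts), estimate))
```

With many documents captured only once, the estimate sits well above the number found, which signals that reviewing should continue; as singletons become rare, the estimate approaches the observed count and the rule fires.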
Taxonomy
Topics: Delphi Technique in Research
