Heuristic Stopping Rules For Technology-Assisted Review

Eugene Yang; David D. Lewis; Ophir Frieder

arXiv:2106.09871·cs.IR·June 21, 2021

TL;DR

This paper introduces two new heuristic stopping rules for technology-assisted review workflows, Quant and QuantCI, which accurately hit a range of recall targets while substantially reducing review costs.

Contribution

The paper proposes novel model-based heuristic stopping rules, Quant and QuantCI, and demonstrates their effectiveness in improving cost efficiency in TAR workflows.

Findings

01 Quant and QuantCI accurately hit recall targets

02 They substantially reduce review costs

03 Effective across diverse tasks and recall levels

Abstract

Technology-assisted review (TAR) refers to human-in-the-loop active learning workflows for finding relevant documents in large collections. These workflows often must meet a target for the proportion of relevant documents found (i.e., recall) while also holding down costs. A variety of heuristic stopping rules have been suggested for striking this tradeoff in particular settings, but none have been tested against a range of recall targets and tasks. We propose two new heuristic stopping rules, Quant and QuantCI, based on model-based estimation techniques from survey research. We compare them against a range of proposed heuristics and find they are accurate at hitting a range of recall targets while substantially reducing review costs.
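The abstract does not spell out how the rules are computed. As a rough illustration only, a model-based recall check of this general kind might look like the sketch below. It assumes the classifier yields calibrated relevance probabilities for unreviewed documents and treats their sum as the expected number of relevant documents still unfound. The function names `estimated_recall` and `should_stop`, and the estimator itself, are hypothetical and are not taken from the paper's definitions of Quant or QuantCI.

```python
from typing import Sequence


def estimated_recall(
    reviewed_labels: Sequence[int],
    unreviewed_probs: Sequence[float],
) -> float:
    """Illustrative recall estimate (an assumption, not the paper's formula).

    reviewed_labels: 1 for documents the reviewer marked relevant, else 0.
    unreviewed_probs: model-predicted relevance probabilities for the
        documents that have not been reviewed yet.
    """
    found = sum(reviewed_labels)
    # Expected count of relevant documents among the unreviewed ones,
    # assuming the probabilities are well calibrated.
    expected_remaining = sum(unreviewed_probs)
    total = found + expected_remaining
    # With no relevant documents found or expected, nothing is left to find.
    return found / total if total > 0 else 1.0


def should_stop(
    reviewed_labels: Sequence[int],
    unreviewed_probs: Sequence[float],
    target_recall: float = 0.8,
) -> bool:
    """Stop the review once the estimated recall reaches the target."""
    return estimated_recall(reviewed_labels, unreviewed_probs) >= target_recall
```

In a TAR loop, a check like this would run after each review batch, once the model has been retrained and the remaining documents rescored. A more conservative variant would stop only when a lower confidence bound on the estimate clears the target, trading extra review cost for a lower risk of stopping short.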
