Autonomy and Reliability of Continuous Active Learning for Technology-Assisted Review

Gordon V. Cormack; Maura R. Grossman

arXiv:1504.06868 · cs.IR · April 28, 2015 · 66 cites

TL;DR

This paper improves the autonomy and reliability of continuous active learning for technology-assisted review by removing tuning parameters and demonstrating superior performance across multiple datasets and tasks.

Contribution

The authors enhance continuous active learning by eliminating topic-specific tuning, making it more autonomous and effective for various document review tasks.

Findings

01 Consistently outperforms previous methods on multiple datasets

02 Requires minimal user input, only an initial query or relevant document

03 Achieves superior results across legal, news, and filtering tasks

Abstract

We enhance the autonomy of the continuous active learning method shown by Cormack and Grossman (SIGIR 2014) to be effective for technology-assisted review, in which documents from a collection are retrieved and reviewed, using relevance feedback, until substantially all of the relevant documents have been reviewed. Autonomy is enhanced through the elimination of topic-specific and dataset-specific tuning parameters, so that the sole input required by the user is, at the outset, a short query, topic description, or single relevant document; and, throughout the review, ongoing relevance assessments of the retrieved documents. We show that our enhancements consistently yield superior results to Cormack and Grossman's version of continuous active learning, and other methods, not only on average, but on the vast majority of topics from four separate sets of tasks: the legal datasets examined…
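The review loop the abstract describes (seed with a short query, retrieve, collect relevance assessments, feed them back, repeat) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not the authors' method: the term-count scorer stands in for whatever learner the paper uses, and the names `continuous_review`, `assess`, `batch_size`, and `patience` are hypothetical. The stopping rule (stop after `patience` consecutive batches with no relevant document) is also an assumption; the abstract says only that review continues until substantially all relevant documents have been reviewed.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately crude stand-in)."""
    return text.lower().split()


def score(weights, text):
    """Sum the learned weight of each distinct term in the document."""
    return sum(weights[t] for t in set(tokenize(text)))


def continuous_review(docs, seed_query, assess, batch_size=2, patience=1):
    """Toy relevance-feedback review loop.

    docs:       dict mapping document id -> text
    seed_query: short query used to build the initial model
    assess:     callable id -> bool, standing in for the human reviewer
    Returns a dict mapping each reviewed id -> its relevance assessment.
    """
    weights = Counter(tokenize(seed_query))  # initial model from the query
    reviewed = {}
    empty_batches = 0  # consecutive batches with no relevant document

    while len(reviewed) < len(docs) and empty_batches < patience:
        unreviewed = [d for d in docs if d not in reviewed]
        # Rank unreviewed documents by current score; break ties by id.
        ranked = sorted(unreviewed, key=lambda d: (-score(weights, docs[d]), d))

        found = 0
        for d in ranked[:batch_size]:
            relevant = assess(d)
            reviewed[d] = relevant
            found += relevant
            # Feedback step: raise weights of terms seen in relevant
            # documents, lower those seen in non-relevant ones.
            delta = 1 if relevant else -1
            for t in set(tokenize(docs[d])):
                weights[t] += delta

        empty_batches = 0 if found else empty_batches + 1

    return reviewed
```

In this sketch the only inputs are the seed query and the ongoing assessments, which mirrors the input model stated in the abstract; `batch_size` and `patience` remain as knobs purely to keep the toy example short.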

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Software Engineering Research