Technology Assisted Reviews: Finding the Last Few Relevant Documents by Asking Yes/No Questions to Reviewers

Jie Zou; Dan Li; Evangelos Kanoulas

arXiv:1810.05414 · cs.IR · October 15, 2018 · 1 citation

TL;DR

This paper introduces a Bayesian questioning method to efficiently find the last relevant documents in a collection, reducing human effort in technology-assisted reviews.

Contribution

It proposes a novel sequential Bayesian search approach that uses yes/no questions about entities to identify remaining relevant documents more efficiently.

Findings

01. Significantly reduces review effort for last relevant documents

02. Outperforms existing active learning methods in experiments

03. Demonstrates effectiveness in real-world datasets

Abstract

The goal of a technology-assisted review is to achieve high recall with low human effort. Continuous active learning algorithms have demonstrated good performance in locating the majority of relevant documents in a collection; however, their performance plateaus once 80%-90% of them have been found. Finding the last few relevant documents typically requires exhaustively reviewing the collection. In this paper, we propose a novel method to identify these last few, but significant, documents efficiently. Our method hypothesizes that entities carry vital information in documents, and that reviewers can answer questions about the presence or absence of an entity in the missing relevant documents. Based on this, we devise a sequential Bayesian search method that selects the optimal sequence of questions to ask. The experimental results show that our proposed method can…
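
To make the idea of a sequential Bayesian search over yes/no entity questions concrete, here is a minimal sketch. It is not the authors' implementation: it assumes a single missing document, a uniform prior, a fixed answer-noise rate, and a greedy rule that asks about the entity whose presence splits the current belief mass closest to half. The names `pick_entity`, `update`, `search`, and `noise` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Optional, Set


def pick_entity(belief: Dict[str, float],
                doc_entities: Dict[str, Set[str]],
                asked: Set[str]) -> Optional[str]:
    """Return the unasked entity whose 'yes' mass is closest to 0.5."""
    candidates = set().union(*doc_entities.values()) - asked
    best, best_gap = None, float("inf")
    for entity in sorted(candidates):  # sorted for deterministic tie-breaking
        mass_yes = sum(p for d, p in belief.items() if entity in doc_entities[d])
        gap = abs(mass_yes - 0.5)
        if gap < best_gap:
            best, best_gap = entity, gap
    return best


def update(belief: Dict[str, float],
           doc_entities: Dict[str, Set[str]],
           entity: str,
           answer_yes: bool,
           noise: float = 0.1) -> Dict[str, float]:
    """Bayes update: documents consistent with the answer get weight 1 - noise."""
    posterior = {}
    for doc, prior in belief.items():
        consistent = (entity in doc_entities[doc]) == answer_yes
        posterior[doc] = prior * ((1.0 - noise) if consistent else noise)
    total = sum(posterior.values())
    return {doc: p / total for doc, p in posterior.items()}


def search(doc_entities: Dict[str, Set[str]],
           oracle: Callable[[str], bool],
           n_questions: int,
           noise: float = 0.1) -> Dict[str, float]:
    """Ask up to n_questions yes/no questions and return the final belief."""
    belief = {doc: 1.0 / len(doc_entities) for doc in doc_entities}
    asked: Set[str] = set()
    for _ in range(n_questions):
        entity = pick_entity(belief, doc_entities, asked)
        if entity is None:  # no entities left to ask about
            break
        asked.add(entity)
        belief = update(belief, doc_entities, entity, oracle(entity), noise)
    return belief


# Toy collection: the reviewer is assumed to be thinking of document "d2".
docs = {"d1": {"a", "b"}, "d2": {"a", "c"}, "d3": {"d"}, "d4": {"b", "d"}}
final_belief = search(docs, oracle=lambda e: e in docs["d2"], n_questions=3)
top_doc = max(final_belief, key=final_belief.get)
```

In this toy run the reviewer answers truthfully, so the belief concentrates on the document consistent with every answer. The nonzero `noise` keeps an occasional wrong answer from eliminating a document outright.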

Code & Models

Repositories

jiezou0806/SBSTAR (official)

Taxonomy

Topics: Machine Learning and Algorithms · Topic Modeling · Text and Document Classification Technologies