Empirical Evaluations of Active Learning Strategies in Legal Document Review

Rishi Chhatwal; Nathaniel Huber-Fliflet; Robert Keeling; Jianping Zhang; Haozhen Zhao

arXiv:1904.01719·cs.IR·April 4, 2019

TL;DR

This study empirically evaluates active learning strategies in legal document review, revealing that the most popular approach quickly identifies key documents but becomes less efficient over time, suggesting alternative strategies may be more effective.

Contribution

It provides real-world experimental insights into the effectiveness of active learning in legal document review, challenging assumptions about its superiority and proposing tailored strategies.

Findings

01. Popular active learning methods lose efficiency over time

02. Most effective initial strategies differ from ongoing review methods

03. Large, real-world legal datasets used for evaluation

Abstract

One type of machine learning, text classification, is now regularly applied in legal matters involving voluminous document populations because it can reduce the time and expense associated with reviewing those documents. One form of machine learning, Active Learning, has drawn attention from the legal community because it offers the potential to make the machine learning process even more effective. Active Learning, applied to legal documents, is considered a new technology in the legal domain and is applied continuously to all documents in a legal matter until an insignificant number of relevant documents are left for review. This implementation differs slightly from traditional implementations of Active Learning, where the process stops once the model achieves acceptable performance. The purpose of this paper is twofold: (i) to question whether Active Learning actually is a…
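The continuous review process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the toy term-weight scorer, the choice to review the highest-scoring documents first, the `continuous_review` function and its parameters, and the use of a low per-batch yield as a stand-in for "an insignificant number of relevant documents are left".

```python
from typing import Callable, Dict, List, Set


def continuous_review(
    docs: Dict[str, Set[str]],           # doc id -> set of terms (toy representation)
    is_relevant: Callable[[str], bool],  # stands in for the human reviewer's coding decision
    seed_terms: Set[str],                # hypothetical starting terms for the scorer
    batch_size: int = 2,
    min_batch_yield: float = 0.25,       # assumed stopping threshold, not from the paper
) -> List[str]:
    """Review documents in ranked batches until a batch yields too few relevant ones."""
    term_weight: Dict[str, float] = {t: 1.0 for t in seed_terms}
    unreviewed = set(docs)
    reviewed: List[str] = []

    def score(doc_id: str) -> float:
        return sum(term_weight.get(t, 0.0) for t in docs[doc_id])

    while unreviewed:
        # Rank remaining documents by current score, highest first (ties broken by id).
        ranked = sorted(unreviewed, key=lambda d: (-score(d), d))
        batch = ranked[:batch_size]

        hits = 0
        for doc_id in batch:
            label = is_relevant(doc_id)
            hits += int(label)
            # Toy "retraining": reward terms in relevant docs, penalize the rest.
            delta = 1.0 if label else -1.0
            for t in docs[doc_id]:
                term_weight[t] = term_weight.get(t, 0.0) + delta
            unreviewed.discard(doc_id)
            reviewed.append(doc_id)

        # Stop on low yield in the batch, rather than on model performance.
        if hits / len(batch) < min_batch_yield:
            break

    return reviewed


# Made-up example data: three relevant documents among eight.
DOCS = {
    "d1": {"merger", "price"},
    "d2": {"merger", "board"},
    "d3": {"lunch", "menu"},
    "d4": {"price", "board"},
    "d5": {"lunch", "parking"},
    "d6": {"menu", "parking"},
    "d7": {"lunch", "weather"},
    "d8": {"weather", "menu"},
}
RELEVANT = {"d1", "d2", "d4"}

reviewed = continuous_review(DOCS, lambda d: d in RELEVANT, seed_terms={"merger"})
print(reviewed)
```

A traditional Active Learning loop would instead evaluate the model after each batch and stop once a performance target is met. Here the loop keeps going through the population and halts only when a reviewed batch turns up almost nothing relevant.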
