Evaluating Document Representations for Content-based Legal Literature Recommendations

Malte Ostendorff; Elliott Ash; Terry Ruas; Bela Gipp; Julian Moreno-Schneider; Georg Rehm

arXiv:2104.13841·cs.CL·April 29, 2021

TL;DR

This study evaluates various document representation methods for legal literature recommendation, finding that combined fastText and Poincaré embeddings perform best in retrieving related US case law, with open datasets enhancing reproducibility.

Contribution

The paper presents a comprehensive evaluation of 27 state-of-the-art document representation methods for legal literature retrieval, using two newly created open benchmark datasets.

Findings

01 Averaged fastText vectors perform best among text-based methods.

02 Poincaré citation embeddings are highly effective.

03 Hybrid approaches improve retrieval performance.
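The findings above can be illustrated with a minimal sketch of content-based recommendation over hybrid document vectors. Everything concrete here is an assumption made for illustration: the toy word vectors stand in for fastText vectors, the two-dimensional `citation_vectors` stand in for some citation embedding, and combining the two by concatenating L2-normalized vectors is just one plausible hybrid scheme, not necessarily the one used in the paper. The document ids, helper names, and cosine-based ranking are likewise hypothetical.

```python
import math


def average_vectors(vectors):
    """Return the element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def l2_normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def hybrid_vector(text_vec, citation_vec):
    """Concatenate normalized text and citation vectors (assumed hybrid scheme)."""
    return l2_normalize(text_vec) + l2_normalize(citation_vec)


def recommend(seed_id, doc_vectors, k=2):
    """Rank all other documents by cosine similarity to the seed document."""
    seed = doc_vectors[seed_id]
    scored = [
        (cosine(seed, vec), doc_id)
        for doc_id, vec in doc_vectors.items()
        if doc_id != seed_id
    ]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Toy word vectors: placeholders for pretrained fastText vectors.
word_vectors = {
    "contract": [1.0, 0.0, 0.0],
    "breach": [0.8, 0.2, 0.0],
    "damages": [0.6, 0.4, 0.0],
    "patent": [0.0, 0.0, 1.0],
    "invention": [0.0, 0.2, 0.8],
}

# Hypothetical tokenized documents.
documents = {
    "case_a": ["contract", "breach"],
    "case_b": ["contract", "damages"],
    "case_c": ["patent", "invention"],
}

# Toy citation vectors: placeholders for embeddings learned from a citation graph.
citation_vectors = {
    "case_a": [0.9, 0.1],
    "case_b": [0.8, 0.3],
    "case_c": [0.1, 0.9],
}

# Build one hybrid vector per document: averaged word vectors plus citation vector.
doc_vectors = {}
for doc_id, tokens in documents.items():
    text_vec = average_vectors([word_vectors[token] for token in tokens])
    doc_vectors[doc_id] = hybrid_vector(text_vec, citation_vectors[doc_id])

recommendations = recommend("case_a", doc_vectors, k=2)
print(recommendations)
```

Normalizing each part before concatenation keeps the text and citation signals on a comparable scale, so neither dominates the cosine score purely because of vector magnitude. Dropping `hybrid_vector` and using only `text_vec` would give the purely text-based variant of the same sketch.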

Abstract

Recommender systems assist legal professionals in finding relevant literature for supporting their case. Despite its importance for the profession, legal applications do not reflect the latest advances in recommender systems and representation learning research. Simultaneously, legal recommender systems are typically evaluated in small-scale user studies without any publicly available benchmark datasets. Thus, these studies have limited reproducibility. To address the gap between research and practice, we explore a set of state-of-the-art document representation methods for the task of retrieving semantically related US case law. We evaluate text-based (e.g., fastText, Transformers), citation-based (e.g., DeepWalk, Poincaré), and hybrid methods. We compare in total 27 methods using two silver standards with annotations for 2,964 documents. The silver standards are newly created from Open…

Code & Models

Repositories

malteos/legal-document-similarity
Official

Taxonomy

Methods: DeepWalk · fastText