Yes, BM25 is a Strong Baseline for Legal Case Retrieval

Guilherme Moraes Rosa; Ruan Chaves Rodrigues; Roberto Lotufo; Rodrigo Nogueira

arXiv:2105.05686·cs.IR·October 26, 2021·1 cites

TL;DR

This paper reports that a vanilla BM25 retrieval method is a strong baseline for legal case retrieval: the authors' single submission placed second in Task 1 of the COLIEE 2021 competition.

Contribution

The paper shows that a vanilla BM25 approach, with no additional modeling, can score well above the median of submissions on a legal case retrieval benchmark.

Findings

01. BM25 placed second in Task 1 of COLIEE 2021

02. BM25 scored well above the median of submissions

03. Code is publicly available for reproducibility

Abstract

We describe our single submission to task 1 of COLIEE 2021. Our vanilla BM25 got second place, well above the median of submissions. Code is available at https://github.com/neuralmind-ai/coliee.
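For readers unfamiliar with the method, the following is a minimal sketch of BM25 ranking over a toy corpus. It is not the authors' code (that is in the repository linked above). The whitespace tokenizer, the class name `BM25`, the example texts, and the parameter values `k1=1.2` and `b=0.75` are all assumptions chosen for illustration, and the IDF variant shown is one common smoothed form among several.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, purely for illustration.
    return text.lower().split()


class BM25:
    """Illustrative Okapi BM25 scorer (hypothetical helper, not the paper's code)."""

    def __init__(self, docs, k1=1.2, b=0.75):
        # k1 and b are commonly used defaults, assumed here.
        self.k1 = k1
        self.b = b
        self.doc_tokens = [tokenize(d) for d in docs]
        self.doc_len = [len(t) for t in self.doc_tokens]
        self.avgdl = sum(self.doc_len) / len(docs)
        self.tf = [Counter(t) for t in self.doc_tokens]

        # Document frequency: number of documents containing each term.
        df = Counter()
        for tokens in self.doc_tokens:
            df.update(set(tokens))

        n = len(docs)
        # Smoothed IDF that stays non-negative even for very common terms.
        self.idf = {
            term: math.log(1 + (n - freq + 0.5) / (freq + 0.5))
            for term, freq in df.items()
        }

    def score(self, query, i):
        # Sum the per-term BM25 contributions for document i.
        total = 0.0
        for term in tokenize(query):
            f = self.tf[i].get(term, 0)
            if f == 0:
                continue
            length_norm = 1 - self.b + self.b * self.doc_len[i] / self.avgdl
            total += self.idf[term] * f * (self.k1 + 1) / (f + self.k1 * length_norm)
        return total

    def rank(self, query):
        # Return (doc_index, score) pairs, best match first.
        scored = [(i, self.score(query, i)) for i in range(len(self.doc_tokens))]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy "case law" corpus (invented text, not COLIEE data).
cases = [
    "the court held the contract was void for lack of consideration",
    "the appellant sought judicial review of the immigration decision",
    "damages were awarded for breach of contract",
]
bm25 = BM25(cases)
ranking = bm25.rank("breach of contract damages")
for doc_index, doc_score in ranking:
    print(doc_index, round(doc_score, 3))
```

In this toy example the third case, which contains every query term, is ranked first. Real legal case retrieval involves much longer documents, so preprocessing choices and parameter tuning matter considerably more than this sketch suggests.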

Code & Models

Repositories

neuralmind-ai/coliee (Official)

Taxonomy

Topics: Artificial Intelligence in Law · Topic Modeling · Natural Language Processing Techniques