Yes, BM25 is a Strong Baseline for Legal Case Retrieval
Guilherme Moraes Rosa, Ruan Chaves Rodrigues, Roberto Lotufo, Rodrigo Nogueira

TL;DR
This paper shows that a plain BM25 retrieval method is a strong baseline for legal case retrieval: the authors' single submission to Task 1 of the COLIEE 2021 competition placed second.
Contribution
It shows that a vanilla BM25 approach can place near the top of a legal case retrieval benchmark, scoring well above the median of competing submissions.
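For readers unfamiliar with the scoring function, here is a minimal sketch of standard Okapi BM25 ranking in plain Python. It is not the authors' code (their implementation is at the repository linked in the abstract). The whitespace tokenizer, the class and method names, and the parameter values k1=1.2 and b=0.75 are illustrative assumptions (common textbook defaults), not settings taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, for illustration only.
    return text.lower().split()


class BM25:
    """Illustrative Okapi BM25 scorer over a small, non-empty in-memory corpus."""

    def __init__(self, corpus, k1=1.2, b=0.75):
        # k1 and b are assumed textbook defaults, not values from the paper.
        self.k1 = k1
        self.b = b
        # Per-document term frequencies and lengths.
        self.docs = [Counter(tokenize(doc)) for doc in corpus]
        self.doc_len = [sum(tf.values()) for tf in self.docs]
        self.avgdl = sum(self.doc_len) / len(self.docs)
        # Document frequency: number of documents containing each term.
        df = Counter()
        for tf in self.docs:
            df.update(tf.keys())
        n = len(self.docs)
        # Non-negative IDF variant: log(1 + (N - df + 0.5) / (df + 0.5)).
        self.idf = {
            term: math.log(1 + (n - freq + 0.5) / (freq + 0.5))
            for term, freq in df.items()
        }

    def score(self, query, index):
        """BM25 score of the document at `index` for `query`."""
        tf = self.docs[index]
        # Length normalization term shared by all query terms.
        norm = self.k1 * (1 - self.b + self.b * self.doc_len[index] / self.avgdl)
        total = 0.0
        for term in tokenize(query):
            freq = tf.get(term, 0)
            if freq == 0:
                continue  # terms absent from the document contribute nothing
            total += self.idf[term] * freq * (self.k1 + 1) / (freq + norm)
        return total

    def rank(self, query):
        """Return (document index, score) pairs, best match first."""
        scored = [(i, self.score(query, i)) for i in range(len(self.docs))]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In a case retrieval setting, the corpus would hold the candidate cases and the query would be the text of the case being searched for; calling `BM25(corpus).rank(query)` then yields candidates ordered by lexical overlap, weighted by term rarity and document length.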
Findings
BM25 achieved second place in Task 1 of COLIEE 2021
BM25 scored well above the median of submissions
Code is publicly available for reproducibility
Abstract
We describe our single submission to task 1 of COLIEE 2021. Our vanilla BM25 got second place, well above the median of submissions. Code is available at https://github.com/neuralmind-ai/coliee.
Taxonomy
Topics: Artificial Intelligence in Law · Topic Modeling · Natural Language Processing Techniques
