IL-PCSR: Legal Corpus for Prior Case and Statute Retrieval
Shounak Paul, Dhananjay Ghumare, Pawan Goyal, Saptarshi Ghosh, Ashutosh Modi

TL;DR
This paper introduces IL-PCSR, a comprehensive legal corpus for simultaneous retrieval of statutes and prior cases, and demonstrates models that leverage their inherent interdependence for improved legal information retrieval.
Contribution
It presents a unified corpus and models for both statute retrieval and precedent retrieval that exploit the relatedness of the two tasks, which prior work addressed separately.
Findings
LLM-based re-ranking significantly improves retrieval accuracy
Ensemble models outperform individual lexical and semantic models
The corpus enables joint development of related legal retrieval tasks
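As a rough illustration of the second finding, the sketch below merges a lexical ranking and a semantic ranking into a single list using reciprocal rank fusion. This is a generic fusion technique swapped in for illustration, not the GNN-based ensemble the paper describes; the function name, the constant k=60, and the candidate IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of candidate IDs into one ranking.

    Each candidate receives 1 / (k + rank) from every list it appears in;
    candidates ranked highly by several retrievers accumulate the most score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of a lexical retriever and a semantic retriever
# for the same query (IDs are made up for this example).
lexical = ["case_12", "case_07", "case_31"]
semantic = ["case_07", "case_44", "case_12"]

fused = reciprocal_rank_fusion([lexical, semantic])
print(fused)
```

Rank-based fusion sidesteps the problem that lexical and embedding similarity scores live on different scales, which is one reason it is a common baseline for combining retrievers.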
Abstract
Identifying/retrieving relevant statutes and prior cases/precedents for a given legal situation are common tasks performed by legal practitioners. Researchers to date have addressed the two tasks independently, thus developing completely different datasets and models for each task; however, both retrieval tasks are inherently related, e.g., similar cases tend to cite similar statutes (due to similar factual situations). In this paper, we address this gap. We propose IL-PCSR (Indian Legal corpus for Prior Case and Statute Retrieval), a unique corpus that provides a common testbed for developing models for both tasks (Statute Retrieval and Precedent Retrieval) that can exploit the dependence between the two. We experiment extensively with several baseline models on the tasks, including lexical models, semantic models, and ensembles based on GNNs. Further, to exploit the dependence…
Taxonomy
Topics: Artificial Intelligence in Law · Multi-Agent Systems and Negotiation · Topic Modeling
