IL-PCSR: Legal Corpus for Prior Case and Statute Retrieval

Shounak Paul; Dhananjay Ghumare; Pawan Goyal; Saptarshi Ghosh; Ashutosh Modi

arXiv:2511.00268·cs.CL·November 4, 2025

TL;DR

This paper introduces IL-PCSR, a comprehensive legal corpus for simultaneous retrieval of statutes and prior cases, and demonstrates models that leverage their inherent interdependence for improved legal information retrieval.

Contribution

It presents a unified corpus and models for both statute and precedent retrieval, exploiting their relatedness, which was not addressed in prior separate approaches.

Findings

01. LLM-based re-ranking significantly improves retrieval accuracy

02. Ensemble models outperform individual lexical and semantic models

03. The corpus enables joint development of related legal retrieval tasks
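As a rough illustration of the second finding, the sketch below combines a lexical score and a semantic score for each candidate document into a single ranking. It uses plain min-max normalisation and a weighted sum, which is a deliberately simple stand-in and not the GNN-based ensemble the paper experiments with. The function names (`min_max`, `fuse`), the weight `alpha`, the document ids, and the scores are all hypothetical.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale raw scores to [0, 1] so that different scorers are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All candidates tie; give every one the same neutral score.
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(lexical: Dict[str, float],
         semantic: Dict[str, float],
         alpha: float = 0.5) -> List[str]:
    """Rank candidates by alpha * lexical + (1 - alpha) * semantic.

    Assumption: a weighted sum of normalised scores, chosen only for
    illustration; it is not the ensemble method described in the paper.
    """
    lex_n, sem_n = min_max(lexical), min_max(semantic)
    docs = set(lex_n) | set(sem_n)
    fused = {
        d: alpha * lex_n.get(d, 0.0) + (1 - alpha) * sem_n.get(d, 0.0)
        for d in docs
    }
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical scores for one query over a mixed pool of cases and statutes.
lexical_scores = {"case_A": 12.0, "case_B": 7.0, "statute_X": 2.0}
semantic_scores = {"case_A": 0.40, "case_B": 0.90, "statute_X": 0.65}

ranking = fuse(lexical_scores, semantic_scores, alpha=0.5)
print(ranking)
```

In this toy example neither scorer alone ranks `case_B` first under both views, but the fused score does, which is the kind of complementarity an ensemble is meant to capture.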

Abstract

Identifying/retrieving relevant statutes and prior cases/precedents for a given legal situation are common tasks performed by legal practitioners. Researchers to date have addressed the two tasks independently, developing completely different datasets and models for each task; however, the two retrieval tasks are inherently related, e.g., similar cases tend to cite similar statutes (due to similar factual situations). In this paper, we address this gap. We propose IL-PCSR (Indian Legal corpus for Prior Case and Statute Retrieval), a unique corpus that provides a common testbed for developing models for both tasks (Statute Retrieval and Precedent Retrieval) that can exploit the dependence between the two. We experiment extensively with several baseline models on the tasks, including lexical models, semantic models, and ensembles based on GNNs. Further, to exploit the dependence…

Taxonomy

Topics: Artificial Intelligence in Law · Multi-Agent Systems and Negotiation · Topic Modeling