Corpus for Automatic Structuring of Legal Documents

Prathamesh Kalamkar; Aman Tiwari; Astha Agarwal; Saurabh Karn; Smita Gupta; Vivek Raghavan; Ashutosh Modi

arXiv:2201.13125 · cs.CL · September 20, 2022 · 39 citations

Open Access · 1 model · 1 dataset

TL;DR

This paper introduces a new corpus of English legal judgment documents annotated with rhetorical role labels, intended to support automatic document structuring, summarization, and legal judgment prediction.

Contribution

The paper presents a corpus of legal judgments annotated with rhetorical roles, along with baseline models for rhetorical role prediction, and releases both the corpus and the baseline code.

Findings

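01 The corpus segments legal judgments into topical, coherent parts, each labeled with a rhetorical role, which supports automatic structuring of the documents.

02 Baseline models trained on the annotated corpus predict rhetorical roles automatically.

03 Using rhetorical roles improves performance on summarization and legal judgment prediction.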

Abstract

In populous countries, pending legal cases have been growing exponentially. There is a need for developing techniques for processing and organizing legal documents. In this paper, we introduce a new corpus for structuring legal documents. In particular, we introduce a corpus of legal judgment documents in English that are segmented into topical and coherent parts. Each of these parts is annotated with a label coming from a list of pre-defined Rhetorical Roles. We develop baseline models for automatically predicting rhetorical roles in a legal document based on the annotated corpus. Further, we show the application of rhetorical roles to improve performance on the tasks of summarization and legal judgment prediction. We release the corpus and baseline model code along with the paper.
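To make the annotation scheme concrete, here is a minimal sketch of how a judgment labeled at the sentence level could be grouped into contiguous parts that share one rhetorical role. The role names (`FACTS`, `ARGUMENT`, `ANALYSIS`, `RULING`), the `LabeledSentence` type, the `group_into_segments` helper, and the example sentences are all hypothetical placeholders chosen for illustration; they are not the label set, data format, or code released with the paper.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Tuple


@dataclass
class LabeledSentence:
    text: str
    role: str  # hypothetical role name, not the paper's actual label set


def group_into_segments(
    sentences: List[LabeledSentence],
) -> List[Tuple[str, List[str]]]:
    """Merge consecutive sentences that share a role into one segment."""
    segments = []
    for role, run in groupby(sentences, key=lambda s: s.role):
        segments.append((role, [s.text for s in run]))
    return segments


# Invented example sentences, for illustration only.
judgment = [
    LabeledSentence("The appellant was convicted by the trial court.", "FACTS"),
    LabeledSentence("He filed an appeal before this court.", "FACTS"),
    LabeledSentence("Counsel argued that the evidence was insufficient.", "ARGUMENT"),
    LabeledSentence("We find no merit in this contention.", "ANALYSIS"),
    LabeledSentence("The appeal is dismissed.", "RULING"),
]

segments = group_into_segments(judgment)
for role, texts in segments:
    print(f"{role}: {len(texts)} sentence(s)")
```

Grouping only consecutive sentences keeps the document order intact, so a role that reappears later in a judgment would form a separate segment rather than being merged with the earlier one.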

Code & Models

Models

opennyaiorg/InRhetoricalRoles (model)

Datasets

opennyaiorg/InRhetoricalRoles (dataset, 61 downloads)

Taxonomy

Topics: Artificial Intelligence in Law · Natural Language Processing Techniques · Multi-Agent Systems and Negotiation