Pre-trained Language Models for the Legal Domain: A Case Study on Indian Law

Shounak Paul; Arpan Mandal; Pawan Goyal; Saptarshi Ghosh

arXiv:2209.06049·cs.CL·May 16, 2023·27 cites

TL;DR

This paper explores pre-training and fine-tuning Transformer-based legal language models specifically on Indian legal texts, demonstrating improved performance across multiple legal NLP tasks and domains.

Contribution

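The paper continues pre-training two existing legal PLMs, LegalBERT and CaseLawBERT, on Indian legal text, and also trains a new model from scratch with a vocabulary built from Indian legal text. It then applies these models to benchmark legal NLP tasks.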

Findings

01. Improved accuracy on Indian legal NLP tasks.

02. Enhanced performance on European and UK legal texts.

03. Effective explainability analysis of models.

Abstract

NLP in the legal domain has seen increasing success with the emergence of Transformer-based Pre-trained Language Models (PLMs) pre-trained on legal text. PLMs trained over European and US legal text are publicly available; however, legal text from other domains (countries), such as India, has many distinguishing characteristics. With the rapidly increasing volume of Legal NLP applications in various countries, it has become necessary to pre-train such LMs over the legal text of other countries as well. In this work, we investigate pre-training in the Indian legal domain. We re-train (continue pre-training) two popular legal PLMs, LegalBERT and CaseLawBERT, on Indian legal data, and also train a model from scratch with a vocabulary based on Indian legal text. We apply these PLMs over three benchmark legal NLP tasks: Legal Statute Identification from facts, Semantic…
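
The abstract mentions training a model from scratch with a vocabulary based on Indian legal text. One plausible reason a domain vocabulary helps is that frequent legal terms stay whole instead of being split into many subword pieces. The sketch below illustrates this with a greedy longest-match subword splitter in the style of WordPiece, the tokenization scheme used by BERT-family models. The two toy vocabularies and the sample words are hypothetical and chosen only for illustration; they are not taken from the paper or from any released model.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Split one word into subwords by greedy longest-match-first lookup.

    Non-initial pieces carry the "##" continuation prefix, as in
    WordPiece-style vocabularies.
    """
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No prefix of the remainder is in the vocabulary.
            return [unk]
        pieces.append(piece)
        start = end
    return pieces


def total_pieces(words, vocab):
    """Count how many subword pieces a list of words expands to."""
    return sum(len(wordpiece_tokenize(w, vocab)) for w in words)


# Hypothetical toy vocabularies (assumptions, not from the paper).
GENERAL_VOCAB = {"peti", "##tion", "##er", "app", "##ell", "##ant", "court"}
LEGAL_VOCAB = GENERAL_VOCAB | {"petitioner", "appellant"}

WORDS = ["petitioner", "appellant", "court"]

print(wordpiece_tokenize("petitioner", GENERAL_VOCAB))  # ['peti', '##tion', '##er']
print(wordpiece_tokenize("petitioner", LEGAL_VOCAB))    # ['petitioner']
print(total_pieces(WORDS, GENERAL_VOCAB))               # 7
print(total_pieces(WORDS, LEGAL_VOCAB))                 # 3
```

With the toy general vocabulary the three words expand to seven pieces, while the toy legal vocabulary keeps each word as a single token. Whether this effect accounts for the results reported in the paper is not something this sketch can show.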

Code & Models

Repositories

law-ai/pretraining-bert (PyTorch, official)

Taxonomy

Topics: Artificial Intelligence in Law