Indian Legal Text Summarization: A Text Normalisation-based Approach
Satyajit Ghosh, Mousumi Dutta, Tanaya Das

TL;DR
This paper introduces a text normalisation method that improves how domain-independent summarization models perform on Indian legal texts, working around the lack of publicly available datasets for fine-tuning.
Contribution
The authors propose a novel text normalisation approach tailored for Indian legal texts to improve the performance of existing summarization models like BART and PEGASUS.
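As a rough illustration of what a normalisation pass before summarization might look like, the sketch below expands a few legal shorthand forms and collapses whitespace. The rule table and the `normalise` function are hypothetical: this summary does not list the authors' actual normalisation rules, so the patterns here are assumptions chosen only to show the general shape of such a step.

```python
import re

# Hypothetical rule table: these patterns are illustrative assumptions,
# not the rules used in the paper.
ABBREVIATIONS = {
    r"\bu/s\b": "under section",
    r"\br/w\b": "read with",
    r"\bHon'ble\b": "Honourable",
    r"\bv\.": "versus",
}


def normalise(text: str) -> str:
    """Expand shorthand forms, then collapse runs of whitespace."""
    for pattern, replacement in ABBREVIATIONS.items():
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", text).strip()


raw = "The accused was charged u/s 302 r/w 34 IPC.  Hon'ble Court in State v. Kumar"
print(normalise(raw))
```

In a pipeline of this kind, the normalised text rather than the raw judgment would be passed to a general-purpose summarizer such as BART or PEGASUS, so the model sees wording closer to its training data.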
Findings
Normalisation improves summarization quality for legal texts
Domain experts rate the summaries as more accurate after normalisation
ROUGE scores indicate significant enhancement with the proposed method
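For readers unfamiliar with ROUGE, the sketch below computes ROUGE-1 (unigram overlap) between a candidate summary and a reference. It is a simplified version that assumes whitespace tokenisation and lowercasing, with no stemming; the function name `rouge1` and the example sentences are illustrative, and the paper's exact ROUGE configuration is not stated in this summary.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall, and F1 (simplified ROUGE-1)."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1("the court dismissed the appeal", "the court allowed the appeal")
print(scores)
```

Comparing such scores for summaries produced with and without normalisation is one way the reported improvement could be measured.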
Abstract
In the Indian court system, pending cases have long been a problem. There are more than 4 crore cases outstanding. Manually summarising hundreds of documents is a time-consuming and tedious task for legal stakeholders. Many state-of-the-art models for text summarization have emerged as machine learning has progressed. Domain-independent models don't do well with legal texts, and fine-tuning those models for the Indian Legal System is problematic due to a lack of publicly available datasets. To improve the performance of domain-independent models, the authors have proposed a methodology for normalising legal texts in the Indian context. The authors experimented with two state-of-the-art domain-independent models for legal text summarization, namely BART and PEGASUS. BART and PEGASUS are put through their paces in terms of extractive and abstractive summarization to understand the…
Taxonomy
Topics: Artificial Intelligence in Law · Natural Language Processing Techniques · Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · PEGASUS · Linear Layer · Dropout · Adam · Byte Pair Encoding · Residual Connection · Dense Connections · Layer Normalization
