Advantages of Domain Knowledge Injection for Legal Document Summarization: A Case Study on Summarizing Indian Court Judgments in English and Hindi
Debtanu Datta, Rajdeep Mukherjee, Adrijit Goswami, Saptarshi Ghosh

TL;DR
This paper demonstrates that injecting legal domain knowledge into neural summarization models significantly improves the quality and factual accuracy of summaries for Indian court judgments in both English and Hindi.
Contribution
The study introduces a framework for incorporating legal domain knowledge into extractive and generative summarization models, improving the summarization of Indian court judgments in both English and Hindi.
Findings
Significant improvements in summarization quality metrics
Enhanced factual consistency and legal relevance
Validated effectiveness through domain expert evaluations
Abstract
Summarizing Indian court judgments is a complex task, not only because of the intricate language and unstructured nature of legal texts, but also because a large section of the Indian population does not understand the complex English in which legal text is written, thus requiring summaries in Indian languages. In this study, we aim to improve the summarization of Indian legal text to generate summaries in both English and Hindi (the most widely spoken Indian language), by injecting domain knowledge into diverse summarization models. We propose a framework to enhance extractive neural summarization models by incorporating domain-specific pre-trained encoders tailored for legal texts. Further, we explore the injection of legal domain knowledge into generative models (including Large Language Models) through continual pre-training on large legal corpora in English and Hindi. Our…
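To make the idea of a swappable encoder in an extractive summarizer more concrete, here is a minimal sketch. It is not the authors' framework: the function names (`bow_encoder`, `extractive_summary`), the centroid-similarity scoring rule, and the toy bag-of-words encoder are all assumptions chosen for illustration. In a real system, the `encoder` argument would presumably be replaced by a legal-domain pre-trained model of the kind the abstract mentions.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List

Vector = Dict[str, float]


def bow_encoder(sentence: str) -> Vector:
    """Toy stand-in encoder (assumption): lowercase bag-of-words counts.

    A domain-specific pre-trained encoder would be plugged in here instead.
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return {tok: float(count) for tok, count in Counter(tokens).items()}


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def extractive_summary(
    sentences: List[str],
    encoder: Callable[[str], Vector],
    k: int = 2,
) -> List[str]:
    """Pick the k sentences most similar to the document centroid.

    The scoring rule is an illustrative choice, not the paper's method.
    Selected sentences are returned in their original document order.
    """
    vectors = [encoder(s) for s in sentences]

    # Document centroid: element-wise sum of all sentence vectors.
    centroid: Vector = {}
    for vec in vectors:
        for key, value in vec.items():
            centroid[key] = centroid.get(key, 0.0) + value

    scores = [cosine(vec, centroid) for vec in vectors]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    judgment = [
        "The court dismissed the appeal.",
        "The appeal was filed by the appellant against the court order.",
        "Weather was pleasant.",
    ]
    for line in extractive_summary(judgment, bow_encoder, k=2):
        print(line)
```

Because the encoder is passed in as a plain callable, swapping a general-purpose encoder for a domain-specific one does not change the selection logic, which is the kind of modularity the abstract's description suggests.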
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Law · Text Readability and Simplification
