A Data Science Approach to Calcutta High Court Judgments: An Efficient LLM and RAG-powered Framework for Summarization and Similar Cases Retrieval
Puspendu Banerjee, Aritra Mazumdar, Wazib Ansar, Saptarsi Goswami, Amlan Chakrabarti

TL;DR
This paper introduces a novel framework combining Large Language Models and Retrieval-Augmented Generation to enhance summarization and case retrieval in Calcutta High Court judgments, improving legal research efficiency.
Contribution
It presents a new LLM and RAG-based system for legal case summarization and retrieval, with fine-tuned models and a vector database tailored for legal texts.
Findings
Significant improvement in legal case summarization quality.
Efficient retrieval of similar cases using RAG techniques.
Enhanced support for legal professionals and students.
Abstract
The judiciary, one of democracy's three pillars, faces a rising number of legal issues, which demands careful use of judicial resources. This research presents a framework that leverages Data Science methodologies, notably Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG), to make the analysis of Calcutta High Court verdicts more efficient. Our framework focuses on two key aspects: first, a robust summarization mechanism that distills complex legal texts into concise and coherent summaries; and second, an intelligent system for retrieving similar cases, which will assist legal professionals in research and decision making. By fine-tuning the Pegasus model on case head note summaries, we achieve significant improvements in the summarization of legal cases. Our two-step summarizing technique preserves…
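The similar-case retrieval idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract names neither the embedding model nor the vector database, so the example substitutes a toy bag-of-words vector for a learned embedding and a plain dictionary for the vector store. The names `embed`, `cosine`, `retrieve_similar`, and the case texts in `CASES` are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model (assumption): term-count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_similar(query, summaries, top_k=2):
    """Rank stored case summaries by similarity to the query; return the top_k."""
    q = embed(query)
    scored = [(cosine(q, embed(text)), case_id) for case_id, text in summaries.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(case_id, score) for score, case_id in scored[:top_k]]


# Hypothetical case summaries standing in for a vector database of judgments.
CASES = {
    "case_a": "Tenant eviction dispute over unpaid rent and lease termination",
    "case_b": "Writ petition challenging land acquisition compensation",
    "case_c": "Landlord seeks eviction of tenant for breach of lease",
}

results = retrieve_similar("eviction of a tenant under a lease", CASES)
for case_id, score in results:
    print(case_id, round(score, 3))
```

In a retrieval-augmented setup, the summaries returned by a step like `retrieve_similar` would typically be passed to the language model as context. A production system would likely replace `embed` with a learned sentence-embedding model and the dictionary with an indexed vector store.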
Taxonomy
Topics: Artificial Intelligence in Law · Multi-Agent Systems and Negotiation · Computational and Text Analysis Methods
