Evaluating the Role of Large Language Models in Legal Practice in India
Rahul Hemrajani (National Law School of India University, Bengaluru)

TL;DR
This paper empirically evaluates how large language models perform on Indian legal tasks, highlighting their strengths in drafting and issue spotting and their limitations in specialised research and factual accuracy.
Contribution
It provides an empirical assessment of LLMs in Indian legal practice, comparing their outputs with those of a junior lawyer across key legal tasks, with advanced law students rating the work.
Findings
LLMs excel in drafting and issue spotting
LLMs frequently hallucinate in specialised legal research, producing factually incorrect or fabricated outputs
Human expertise remains crucial for nuanced legal reasoning
Abstract
The integration of Artificial Intelligence (AI) into the legal profession raises significant questions about the capacity of Large Language Models (LLMs) to perform key legal tasks. In this paper, I empirically evaluate how well LLMs, such as GPT, Claude, and Llama, perform key legal tasks in the Indian context, including issue spotting, legal drafting, advice, research, and reasoning. Through a survey experiment, I compare outputs from LLMs with those of a junior lawyer, with advanced law students rating the work on helpfulness, accuracy, and comprehensiveness. LLMs excel in drafting and issue spotting, often matching or surpassing human work. However, they struggle with specialised legal research, frequently generating hallucinations, that is, factually incorrect or fabricated outputs. I conclude that while LLMs can augment certain legal tasks, human expertise remains essential for nuanced…
