Structured Definitions and Segmentations for Legal Reasoning in LLMs: A Study on Indian Legal Data

Mann Khatri, Mirza Yusuf, Rajiv Ratn Shah, and Ponnurangam Kumaraguru

arXiv:2511.20669 · cs.CL · November 27, 2025

Open Access

TL;DR

This paper investigates how reorganizing legal documents by rhetorical role and defining legal terms affect the reasoning and prediction capabilities of Large Language Models on Indian legal data, with document reorganization improving F1 score by up to 4.36%.

Contribution

The paper introduces structured document organization and rhetorical role definitions as methods for improving LLM understanding in legal tasks, especially in zero-shot settings.

Findings

01. Organizing legal documents by rhetorical role improves model F1 score by up to 4.36%.

02. Defining legal terms significantly boosts model accuracy.

03. Structured approaches aid in processing long, complex legal texts.

Abstract

Large Language Models (LLMs), trained on extensive datasets from the web, exhibit remarkable general reasoning skills. Despite this, they often struggle in specialized areas like law, mainly because they lack domain-specific pretraining. The legal field presents unique challenges, as legal documents are generally long and intricate, making it hard for models to process the full text efficiently. Previous studies have examined in-context approaches to address the knowledge gap, boosting model performance in new domains without full domain alignment. In our paper, we analyze model behavior on legal tasks by conducting experiments in three areas: (i) reorganizing documents based on rhetorical roles to assess how structured information affects long context processing and model decisions, (ii) defining rhetorical roles to familiarize the model with legal terminology, and (iii) emulating the…
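
As a rough illustration of experiments (i) and (ii), the sketch below groups pre-labeled document segments by rhetorical role and optionally prepends role definitions to the prompt. It is a minimal sketch under stated assumptions, not the authors' implementation: the role labels, the definition wording, the canonical role order, and the names `Segment`, `reorganize`, and `build_prompt` are all hypothetical, and the segments are assumed to be labeled already.

```python
from dataclasses import dataclass

# Hypothetical role labels and definitions, for illustration only.
# The paper's actual label set and definition wording are not given here.
ROLE_DEFINITIONS = {
    "FACTS": "Events and circumstances that led to the case.",
    "ARGUMENTS": "Claims and contentions made by the parties.",
    "REASONING": "The court's analysis of the facts and arguments.",
    "DECISION": "The court's final ruling.",
}
# Assumed canonical order: the order in which the roles are defined above.
ROLE_ORDER = list(ROLE_DEFINITIONS)


@dataclass
class Segment:
    role: str  # rhetorical role label, assumed to be assigned beforehand
    text: str


def reorganize(segments):
    """Group segments by role in ROLE_ORDER.

    sorted() is stable, so segments that share a role keep their
    original relative order. Unknown roles are placed last.
    """
    rank = {role: i for i, role in enumerate(ROLE_ORDER)}
    return sorted(segments, key=lambda s: rank.get(s.role, len(rank)))


def build_prompt(segments, question, with_definitions=True):
    """Assemble a zero-shot prompt from reorganized segments."""
    parts = []
    if with_definitions:
        parts.append("Definitions of rhetorical roles:")
        parts.extend(f"- {r}: {d}" for r, d in ROLE_DEFINITIONS.items())
        parts.append("")
    current = None
    for seg in reorganize(segments):
        if seg.role != current:  # emit one header per role group
            parts.append(f"[{seg.role}]")
            current = seg.role
        parts.append(seg.text)
    parts.append("")
    parts.append(question)
    return "\n".join(parts)


if __name__ == "__main__":
    doc = [
        Segment("REASONING", "The lease contained no renewal clause."),
        Segment("FACTS", "The appellant leased the premises in 2010."),
        Segment("ARGUMENTS", "The appellant claims an implied renewal."),
        Segment("FACTS", "The lease expired in 2015."),
    ]
    print(build_prompt(doc, "Based on the above, should the appeal be allowed?"))
```

Toggling `with_definitions` gives a simple way to compare prompts with and without the definitions while holding the reorganized document fixed.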

Taxonomy

Topics: Artificial Intelligence in Law · Topic Modeling · Multi-Agent Systems and Negotiation