ReGal: A First Look at PPO-based Legal AI for Judgment Prediction and Summarization in India

Shubham Kumar Nigam; Tanuj Tyagi; Siddharth Shukla; Aditya Kumar Guru; Balaramamahanthi Deepak Patnaik; Danush Khanna; Noel Shallum; Kripabandhu Ghosh; Arnab Bhattacharya

arXiv:2512.18014 · cs.CL · December 23, 2025

TL;DR

This paper explores the application of reinforcement learning, specifically PPO, to legal AI tasks in India, focusing on judgment prediction and summarization, and discusses the challenges and potential of RL in legal NLP.

Contribution

Introduces ReGal, a reinforcement learning framework for legal AI that combines multi-task instruction tuning with Reinforcement Learning from AI Feedback (RLAIF) using PPO, and examines the challenges of applying RL to legal texts.

Findings

01 Underperforms supervised and proprietary models on standard evaluation metrics

02 Highlights challenges in reward model alignment, legal language complexity, and domain-specific adaptation

03 Provides insights into RL's potential for legal reasoning tasks

Abstract

This paper presents an early exploration of reinforcement learning methodologies for legal AI in the Indian context. We introduce Reinforcement Learning-based Legal Reasoning (ReGal), a framework that integrates Multi-Task Instruction Tuning with Reinforcement Learning from AI Feedback (RLAIF) using Proximal Policy Optimization (PPO). Our approach is evaluated across two critical legal tasks: (i) Court Judgment Prediction and Explanation (CJPE), and (ii) Legal Document Summarization. Although the framework underperforms on standard evaluation metrics compared to supervised and proprietary models, it provides valuable insights into the challenges of applying RL to legal texts. These challenges include reward model alignment, legal language complexity, and domain-specific adaptation. Through empirical and qualitative analysis, we demonstrate how RL can be repurposed for high-stakes,…
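For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that PPO optimizes. It is not the authors' implementation: the function name `ppo_clipped_loss`, the default `clip_eps=0.2`, and the plain-list inputs are illustrative assumptions. In an RLAIF setup the advantages would typically be derived from an AI judge's reward scores, which is also assumed here rather than taken from the paper.

```python
import math

def ppo_clipped_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss (to be minimized) over a batch.

    All names and defaults are hypothetical; this illustrates the generic
    PPO objective, not the ReGal training code.
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(ln - lo)
        # Restrict the ratio to [1 - eps, 1 + eps] to limit the policy update.
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic (lower) bound of the unclipped and clipped objectives.
        total += min(ratio * adv, clipped * adv)
    # Negate because optimizers minimize, while PPO maximizes the surrogate.
    return -total / len(advantages)
```

With a positive advantage, the clip caps how much the objective rewards increasing an action's probability; with a negative advantage, the unclipped term is kept when it is lower, so moves that make a bad action more likely are still fully penalized.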

Taxonomy

Topics: Artificial Intelligence in Law · Multi-Agent Systems and Negotiation · Topic Modeling