A Comprehensive Evaluation of Large Language Models on Legal Judgment Prediction

Ruihao Shui; Yixin Cao; Xiang Wang; Tat-Seng Chua

arXiv:2310.11761 · cs.CL · October 19, 2023 · 1 citation

TL;DR

This paper systematically evaluates large language models on legal judgment prediction, demonstrating their capabilities, limitations, and the impact of information retrieval integration in legal reasoning tasks.

Contribution

It introduces practical baseline solutions for legal judgment prediction using LLMs and IR systems, and reveals insights into their performance and limitations in legal reasoning.

Findings

01. LLMs recall legal domain knowledge more effectively when prompts include similar cases and multi-choice options.

02. An IR system alone can outperform an LLM+IR combination when the LLM is weak, which makes the LLM redundant in such scenarios.

03. The evaluation pipeline can be adapted to other tasks and domains.

Abstract

Large language models (LLMs) have demonstrated great potential for domain-specific applications, such as the law domain. However, recent disputes over GPT-4's law evaluation raise questions concerning their performance in real-world legal tasks. To systematically investigate their competency in the law, we design practical baseline solutions based on LLMs and test on the task of legal judgment prediction. In our solutions, LLMs can work alone to answer open questions or coordinate with an information retrieval (IR) system to learn from similar cases or solve simplified multi-choice questions. We show that similar cases and multi-choice options, namely label candidates, included in prompts can help LLMs recall domain knowledge that is critical for expert legal reasoning. We additionally present an intriguing paradox wherein an IR system surpasses the performance of LLM+IR due to…
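
The prompt settings described in the abstract (an LLM answering an open question on its own, an LLM shown retrieved similar cases, and an LLM choosing among label candidates) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names `retrieve_similar` and `build_prompt`, the prompt wording, the word-overlap retriever standing in for a real IR system, and the toy cases are all assumptions introduced here.

```python
from typing import List, Optional, Tuple

Case = Tuple[str, str]  # (fact description, charge label); hypothetical format


def retrieve_similar(query: str, corpus: List[Case], k: int = 1) -> List[Case]:
    """Toy stand-in for an IR system: rank cases by Jaccard word overlap."""
    query_words = set(query.lower().split())

    def score(case: Case) -> float:
        case_words = set(case[0].lower().split())
        union = query_words | case_words
        return len(query_words & case_words) / len(union) if union else 0.0

    return sorted(corpus, key=score, reverse=True)[:k]


def build_prompt(
    fact: str,
    similar_cases: Optional[List[Case]] = None,
    label_candidates: Optional[List[str]] = None,
) -> str:
    """Assemble a prompt for one of the settings (wording is an assumption)."""
    parts = []
    # Optional demonstrations: retrieved similar cases with their labels.
    for case_fact, case_label in similar_cases or []:
        parts.append(f"Similar case: {case_fact}\nCharge: {case_label}")
    parts.append(f"Case: {fact}")
    if label_candidates:
        # Multi-choice setting: list label candidates as lettered options.
        options = "\n".join(
            f"({chr(ord('A') + i)}) {label}"
            for i, label in enumerate(label_candidates)
        )
        parts.append(f"Options:\n{options}")
        parts.append("Which option is the charge?")
    else:
        # Open-question setting: the model must produce the label itself.
        parts.append("What is the charge?")
    return "\n\n".join(parts)


if __name__ == "__main__":
    corpus = [
        ("the defendant stole a bicycle from the store", "theft"),
        ("the defendant set fire to a warehouse", "arson"),
    ]
    fact = "the defendant stole a wallet from a store"
    neighbours = retrieve_similar(fact, corpus, k=1)
    labels = [label for _, label in corpus]
    print(build_prompt(fact, similar_cases=neighbours, label_candidates=labels))
```

Passing neither optional argument gives the open-question setting, and passing one or both gives the similar-case and multi-choice variants, so a single function covers all the combinations an evaluation would need to compare.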

Code & Models

Repositories

srhthu/lm-compeval-legal (Official)

Taxonomy

Topics: Artificial Intelligence in Law · Topic Modeling · Natural Language Processing Techniques