Fine-tuning Large Language Models for Improving Factuality in Legal Question Answering

Yinghao Hu; Leilei Gan; Wenyi Xiao; Kun Kuang; Fei Wu

arXiv:2501.06521 · cs.CL · January 14, 2025

TL;DR

This paper introduces a new benchmark and metrics for evaluating hallucinations in legal question answering by large language models, and proposes a novel mitigation method that significantly reduces hallucination rates.

Contribution

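The paper presents a benchmark (LegalHalBench), three automatic metrics, and a hallucination mitigation method for legal question answering that combines behavior cloning with Hard Sample-aware Iterative Direct Preference Optimization (HIPO).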

Findings

01 Improvements on the three proposed hallucination metrics: Non-Hallucinated Statute Rate, Statute Relevance Rate, and Legal Claim Truthfulness.

02 Lower hallucination rates in legal QA.

03 More factual and more relevant answers.

Abstract

Hallucination, the generation of incorrect or fabricated information, remains a critical challenge for large language models (LLMs), particularly in high-stakes domains such as legal question answering (QA). To reduce hallucination in legal QA, we first introduce a benchmark called LegalHalBench and three automatic metrics for evaluating the common hallucinations that arise when LLMs answer legal questions. We then propose a hallucination mitigation method that integrates behavior cloning with a novel Hard Sample-aware Iterative Direct Preference Optimization (HIPO). We conduct extensive experiments on real data to validate the effectiveness of our approach. Our results demonstrate remarkable improvements in various metrics, including the newly proposed Non-Hallucinated Statute Rate, Statute Relevance Rate, Legal Claim Truthfulness, as well as traditional metrics such as METEOR,…
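For readers unfamiliar with Direct Preference Optimization, the sketch below shows the standard per-pair DPO loss that preference-based methods of this kind typically build on. It is not the paper's HIPO procedure: the abstract does not describe the hard-sample-aware or iterative components, so they are not reproduced here. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair.

    Assumption: each argument is the summed log-probability of a full
    response (chosen or rejected) under the policy or the frozen
    reference model. This is plain DPO, not the HIPO variant.
    """
    # Implicit rewards: scaled log-ratio of policy to reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)), written as a numerically stable softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# A policy that favors the chosen answer more than the reference does
# receives a lower loss than one that favors the rejected answer.
good = dpo_loss(-1.0, -5.0, -3.0, -3.0)
bad = dpo_loss(-5.0, -1.0, -3.0, -3.0)
print(good < bad)
```

In a legal QA setting, the chosen response would presumably be the less hallucinated answer and the rejected response the more hallucinated one; how such pairs are constructed is specified in the paper, not here.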

Code & Models

Repositories

yinghaohu/legalhalbench
PyTorch · Official
