Pre-training, Fine-tuning and Re-ranking: A Three-Stage Framework for Legal Question Answering

Shiwen Ni; Hao Cheng; Min Yang

arXiv:2412.19482·cs.CL·December 30, 2024

Open Access

TL;DR

This paper introduces a three-stage framework combining pre-training, fine-tuning, and re-ranking to improve legal question answering by enhancing domain-specific text representations and retrieval accuracy.

Contribution

It proposes a novel three-stage framework (PFR-LQA) that leverages domain-specific pre-training, task fine-tuning, and contextual re-ranking to outperform existing legal QA methods.

Findings

1. Outperforms strong competitors on legal QA datasets.
2. Enhances dense retrieval with domain-specific pre-training.
3. Improves question re-ranking through contextual similarity.

Abstract

Legal question answering (QA), which aims to retrieve the most applicable answers from a large-scale database of question-answer pairs, has attracted increasing attention from people seeking legal advice. Previous methods mainly use a dual-encoder architecture to learn dense representations of both questions and answers. However, these methods can suffer from a lack of domain knowledge and of sufficient labeled training data. In this paper, we propose a three-stage (Pre-training, Fine-tuning and Re-ranking) framework for Legal QA, called PFR-LQA, which promotes fine-grained text representation learning and boosts the performance of dense retrieval with the dual-encoder architecture. Concretely, we first conduct domain-specific pre-training on legal questions and answers through a self-supervised training objective, allowing…
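As a rough illustration of the dual-encoder retrieval setup the abstract refers to, the sketch below encodes a question and each candidate answer independently, then ranks the answers by the dot product of the two vectors. It is not the paper's model: `encode`, `score`, and `retrieve` are hypothetical names, and the normalised bag-of-words vector is only a stand-in for a trained neural encoder. The pre-training, fine-tuning, and re-ranking stages are not shown.

```python
import math
from collections import Counter


def encode(text):
    """Toy stand-in for a learned encoder (assumption): L2-normalised bag of words."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0.0:
        return {}
    return {token: c / norm for token, c in counts.items()}


def score(question_vec, answer_vec):
    """Dot product of two sparse vectors; equals cosine similarity after normalisation."""
    return sum(w * answer_vec.get(token, 0.0) for token, w in question_vec.items())


def retrieve(question, answers, k=1):
    """Encode question and answers separately, then return the k best (score, answer) pairs."""
    q_vec = encode(question)
    scored = [(score(q_vec, encode(answer)), answer) for answer in answers]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Made-up candidate answers, for illustration only.
answers = [
    "a tenant may terminate the lease with thirty days written notice",
    "an employer must pay overtime for hours worked beyond the weekly limit",
    "a will must be signed in front of two witnesses",
]
top = retrieve("how can a tenant terminate a lease", answers, k=1)
```

Because the question and the answers are encoded independently, answer vectors can be computed once and stored, which is what makes this architecture practical for a large database of question-answer pairs.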

Taxonomy

Topics: Artificial Intelligence in Law · Legal Education and Practice Innovations · Topic Modeling

Methods: Softmax · Attention Is All You Need