Pre-training, Fine-tuning and Re-ranking: A Three-Stage Framework for Legal Question Answering
Shiwen Ni, Hao Cheng, Min Yang

TL;DR
This paper introduces a three-stage framework combining pre-training, fine-tuning, and re-ranking to improve legal question answering by enhancing domain-specific text representations and retrieval accuracy.
Contribution
It proposes a novel three-stage framework (PFR-LQA) that leverages domain-specific pre-training, task fine-tuning, and contextual re-ranking to outperform existing legal QA methods.
Findings
Outperforms strong competitors on legal QA datasets.
Enhances dense retrieval with domain-specific pre-training.
Improves question re-ranking through contextual similarity.
Abstract
Legal question answering (QA), which aims to retrieve the most applicable answers from a large-scale database of question-answer pairs, has attracted increasing attention from people seeking legal advice. Previous methods mainly use a dual-encoder architecture to learn dense representations of both questions and answers. However, these methods can suffer from a lack of domain knowledge and insufficient labeled training data. In this paper, we propose a three-stage (pre-training, fine-tuning and re-ranking) framework for legal QA (called PFR-LQA), which promotes fine-grained text representation learning and boosts the performance of dense retrieval with the dual-encoder architecture. Concretely, we first conduct domain-specific pre-training on legal questions and answers through a self-supervised training objective, allowing…
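To make the retrieve-then-re-rank idea concrete, here is a minimal sketch of generic dense retrieval followed by a re-ranking pass. It is not the paper's implementation: the encoders, training objectives, and re-ranking score used by PFR-LQA are not reproduced here. The function names (`cosine`, `retrieve`, `rerank`), the toy two-dimensional vectors, and the `toy_scores` table are all hypothetical stand-ins for embeddings and scores that a trained model would produce.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query_vec, answer_vecs, k):
    """Dense retrieval: return indices of the k answers most similar to the query."""
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(answer_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


def rerank(candidates, rerank_score):
    """Re-ranking: reorder the shortlist using a second scoring function."""
    return sorted(candidates, key=rerank_score, reverse=True)


# Toy embeddings (assumption: in practice these come from the question
# and answer encoders of a dual-encoder model).
query = [1.0, 0.0]
answers = [[0.9, 0.1], [0.0, 1.0], [0.7, 0.7], [1.0, 0.0]]

shortlist = retrieve(query, answers, k=2)

# Hypothetical re-ranking scores for the shortlisted answers.
toy_scores = {0: 0.8, 3: 0.5}
final = rerank(shortlist, lambda idx: toy_scores.get(idx, 0.0))
```

The split reflects a common design choice: the first pass is cheap because answer vectors can be computed ahead of time, while the second pass can afford a more expensive score because it only sees a short list.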
Taxonomy
Topics: Artificial Intelligence in Law · Legal Education and Practice Innovations · Topic Modeling
Methods: Softmax · Attention Is All You Need
