Building Legal Case Retrieval Systems with Lexical Matching and Summarization using A Pre-Trained Phrase Scoring Model
Vu Tran, Minh Le Nguyen, Ken Satoh

TL;DR
This paper introduces a legal case retrieval system that combines lexical matching and summarization-based encoding, leveraging a pre-trained phrase scoring model to improve retrieval accuracy and achieve state-of-the-art results on the COLIEE 2019 legal case retrieval task.
Contribution
The paper presents a novel combination of lexical features and summarization-based document encoding for legal case retrieval, with the document representation model trained on COLIEE 2018 data.
Findings
Achieved state-of-the-art performance on the COLIEE 2019 legal case retrieval benchmark.
Combining lexical features with summarization-based latent features improves retrieval accuracy.
The summarization encoding captures essential document properties beneficial for retrieval tasks.
Abstract
We present our method for tackling the legal case retrieval task of the Competition on Legal Information Extraction/Entailment (COLIEE) 2019. Our approach is based on the idea that summarization is important for retrieval. On the one hand, we adopt a summarization-based model called encoded summarization, which encodes a given document into a continuous vector space that embeds the summary properties of the document. We train this document representation model on the COLIEE 2018 resources. On the other hand, we extract lexical features from different parts of a given query and its candidates. We observe that comparing different parts of the query and its candidates yields better performance. Furthermore, combining the lexical features with the latent features from the summarization-based method achieves even better performance. We have achieved the state-of-the-art…
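The idea of comparing parts of a query against parts of each candidate, then combining those lexical features with a latent similarity, can be illustrated with a minimal sketch. This is not the authors' implementation: the part names, the Jaccard overlap measure, the fixed `lexical_weight`, and the toy vectors standing in for encoded-summarization outputs are all assumptions chosen for illustration, and a learned ranking model would likely replace the simple weighted sum.

```python
import math
import re
from typing import Dict, List, Sequence


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard(a: Sequence[str], b: Sequence[str]) -> float:
    """Token-set overlap; a stand-in for whatever lexical measure is used."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def lexical_features(query_parts: Dict[str, str],
                     candidate_parts: Dict[str, str]) -> List[float]:
    """One overlap score per (query part, candidate part) pair.

    The part names (e.g. "summary", "body") are hypothetical labels.
    """
    features = []
    for q_name in sorted(query_parts):
        for c_name in sorted(candidate_parts):
            features.append(
                jaccard(tokenize(query_parts[q_name]),
                        tokenize(candidate_parts[c_name]))
            )
    return features


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two latent document vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def combined_score(lex: Sequence[float], latent_sim: float,
                   lexical_weight: float = 0.5) -> float:
    """Weighted sum of mean lexical overlap and latent similarity.

    The fixed weight is an assumption; it is not taken from the paper.
    """
    lex_score = sum(lex) / len(lex) if lex else 0.0
    return lexical_weight * lex_score + (1.0 - lexical_weight) * latent_sim


# Toy usage: the vectors below are placeholders for encoder outputs.
query = {"summary": "appeal dismissed for delay", "body": "the applicant appealed late"}
candidate = {"summary": "appeal dismissed", "body": "the court dismissed the late appeal"}
score = combined_score(lexical_features(query, candidate),
                       cosine([0.2, 0.7, 0.1], [0.3, 0.6, 0.1]))
```

Candidates would then be ranked by `score` for each query. Keeping the per-pair features as a list, rather than collapsing them immediately, leaves room to feed them into a trained classifier instead of averaging.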
