Encoded Summarization: Summarizing Documents into Continuous Vector Space for Legal Case Retrieval

Vu Tran, Minh Le Nguyen, Satoshi Tojo, and Ken Satoh

arXiv:2309.08187·cs.CL·September 18, 2023

TL;DR

This paper introduces a novel method for legal case retrieval by encoding documents into a continuous vector space through summarization and neural network-based phrase scoring, improving retrieval accuracy.

Contribution

The paper proposes an encoding technique that summarizes legal documents into a continuous vector space using a neural phrase scoring framework, and shows that combining the resulting latent features with lexical features improves retrieval performance.

Findings

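01 Lexical features and neural latent features complement each other and together improve retrieval performance.

02 Case summarization matters in two forms: using the provided summaries and performing encoded summarization.

03 The approach achieved F1 scores of 65.6% and 57.6% on the experimental legal case retrieval datasets.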

Abstract

We present our method for tackling a legal case retrieval task: encoding documents by summarizing them into continuous vector space via our phrase scoring framework, which utilizes deep neural networks. In addition, we explore the benefits of combining lexical features with latent features generated by neural networks. Our experiments show that the two kinds of features complement each other to improve retrieval system performance. Furthermore, our experimental results suggest the importance of case summarization in two respects: using provided summaries and performing encoded summarization. Our approach achieved F1 of 65.6% and 57.6% on the experimental datasets of legal case retrieval tasks.
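
The abstract does not specify how the lexical and latent features are combined or how F1 is computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a token-overlap (Jaccard) score as a stand-in lexical feature, cosine similarity between document vectors as a stand-in latent feature, and a simple linear mix controlled by a hypothetical weight `alpha`. All function names are illustrative.

```python
import math


def jaccard(query_tokens, doc_tokens):
    """Stand-in lexical feature: token-set overlap between two cases."""
    q, d = set(query_tokens), set(doc_tokens)
    if not q and not d:
        return 0.0
    return len(q & d) / len(q | d)


def cosine(u, v):
    """Stand-in latent feature: cosine similarity of two encoded vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_score(lexical, latent, alpha=0.5):
    """Hypothetical linear mix; alpha is an assumed tuning weight."""
    return alpha * lexical + (1.0 - alpha) * latent


def f1_score(predicted, relevant):
    """F1 over a set of retrieved case ids and a set of relevant case ids."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(relevant)
    return 2 * precision * recall / (precision + recall)
```

In a retrieval loop, each candidate case would be scored against the query case with `combined_score(jaccard(...), cosine(...))`, the top-scoring candidates would be returned, and `f1_score` would compare them with the annotated relevant cases. A learned ranker over both feature types would be a natural alternative to the fixed linear mix.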