Data-efficient End-to-end Information Extraction for Statistical Legal Analysis
Wonseok Hwang, Saehee Eom, Hanuhl Lee, Hai Jin Park, Minjoon Seo

TL;DR
This paper introduces a versatile, data-efficient end-to-end information extraction system for legal documents. It turns unstructured text into structured data suitable for statistical analysis using minimal training data, and is demonstrated on Korean legal precedents.
Contribution
The proposed IE system reformulates information extraction as a generation task, enabling domain-agnostic application with minimal engineering and low data requirements.
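To make the "IE as generation" idea concrete, here is a minimal sketch of how structured fields might be linearized into a single target string for a text-to-text model and parsed back afterwards. The field names, the separators, and the `none` placeholder are all assumptions made for illustration; the paper's actual output format is not described in this summary.

```python
from typing import Dict, List

# Hypothetical schema: these field names are illustrative, not taken from the paper.
SCHEMA: List[str] = ["blood_alcohol_level", "fine_amount"]


def serialize_target(fields: Dict[str, str], schema: List[str]) -> str:
    """Linearize structured fields into one target string for a seq2seq model.

    Fields missing from `fields` are written as the placeholder 'none'.
    """
    return " | ".join(f"{name}: {fields.get(name, 'none')}" for name in schema)


def parse_generation(text: str, schema: List[str]) -> Dict[str, str]:
    """Parse generated text back into a dict keyed by the schema.

    Unknown, missing, or malformed fields fall back to 'none'.
    """
    parsed = {name: "none" for name in schema}
    for chunk in text.split("|"):
        name, sep, value = chunk.partition(":")
        name = name.strip()
        if sep and name in parsed:
            parsed[name] = value.strip() or "none"
    return parsed


if __name__ == "__main__":
    target = serialize_target({"blood_alcohol_level": "0.15%"}, SCHEMA)
    print(target)
    print(parse_generation(target, SCHEMA))
```

Under this framing, supporting a new extraction task only requires changing the schema and the training pairs, which is one way to read the claim of minimal domain-specific engineering.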
Findings
Achieves scores comparable to the rule-based baseline with as few as 50 training examples.
Outperforms the rule-based baseline by +5.4 on average with 200 examples.
Effectively captures macro-level features of the Korean legal system.
Abstract
Legal practitioners often face a vast amount of documents. Lawyers, for instance, search for appropriate precedents favorable to their clients, while the number of legal precedents is ever-growing. Although legal search engines can assist in finding individual target documents and narrowing down the number of candidates, retrieved information is often presented as unstructured text, and users have to examine each document thoroughly, which can lead to information overload. This also makes statistical analysis of the documents challenging. Here, we present an end-to-end information extraction (IE) system for legal documents. By formulating IE as a generation task, our system can be easily applied to various tasks without domain-specific engineering effort. The experimental results of four IE tasks on Korean precedents show that our IE system can achieve competent scores (-2.3 on average) compared…
Taxonomy
Topics: Artificial Intelligence in Law · Topic Modeling
