Transformer-based Approaches for Legal Text Processing

Ha-Thanh Nguyen; Minh-Phuong Nguyen; Thi-Hai-Yen Vuong; Minh-Quan Bui; Minh-Chau Nguyen; Tran-Binh Dang; Vu Tran; Le-Minh Nguyen; Ken Satoh

arXiv:2202.06397·cs.CL·February 15, 2022

TL;DR

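This paper describes Transformer-based approaches to the tasks of the COLIEE 2021 legal text processing competition, shows that pretrained language models perform well on them when paired with appropriate approaches, and introduces two pretrained models, NFSP and NMSP. NFSP achieves the state-of-the-art result in Task 5.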

Contribution

It presents Transformer-based approaches to the COLIEE 2021 tasks and two pretrained models, NFSP and NMSP, that take advantage of parallel translations in the legal domain.

Findings

01. Transformer-based pretrained language models perform well on legal NLP tasks

02. NFSP achieves the state-of-the-art result in Task 5 of COLIEE 2021

03. The proposed methods can serve as useful references for legal NLP applications

Abstract

In this paper, we introduce our approaches using Transformer-based models for different problems of the COLIEE 2021 automatic legal text processing competition. Automated processing of legal documents is a challenging task because of the characteristics of legal documents as well as the limited amount of data. Through detailed experiments, we found that Transformer-based pretrained language models can perform well on automated legal text processing problems when paired with appropriate approaches. We describe in detail the processing steps for each task, such as problem formulation, data processing and augmentation, pretraining, and finetuning. In addition, we introduce to the community two pretrained models that take advantage of parallel translations in the legal domain, NFSP and NMSP. Of these, NFSP achieves the state-of-the-art result in Task 5 of the competition. Although the paper focuses…
