Legal Transformer Models May Not Always Help

Saibo Geng; Rémi Lebret; Karl Aberer

arXiv:2109.06862 · cs.CL · September 16, 2021 · 5 cites

Open Access

TL;DR

This paper evaluates domain adaptive pre-training and language adapters on legal NLP tasks. It finds that domain adaptive pre-training helps mainly on low-resource tasks and that adapters can reduce training costs. It also releases LegalRoBERTa.

Contribution

It benchmarks domain adaptive pre-training across different tasks and dataset splits, and adapters on a typical legal NLP task, showing where each approach helps and where it does not.

Findings

01. Domain adaptive pre-training helps only low-resource tasks.

02. Adapters achieve similar performance to full tuning with less cost.

03. LegalRoBERTa is a new pre-trained legal language model.

Abstract

Deep learning-based Natural Language Processing methods, especially transformers, have achieved impressive performance in the last few years. Applying those state-of-the-art NLP methods to legal activities to automate or simplify some simple work is of great value. This work investigates the value of domain adaptive pre-training and language adapters in legal NLP tasks. By comparing the performance of language models with domain adaptive pre-training on different tasks and different dataset splits, we show that domain adaptive pre-training is only helpful with low-resource downstream tasks, thus far from being a panacea. We also benchmark the performance of adapters in a typical legal NLP task and show that they can yield similar performance to full model tuning with much smaller training costs. As an additional result, we release LegalRoBERTa, a RoBERTa model further pre-trained on…
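
To give a rough sense of why adapters can be cheaper to train than full model tuning, the sketch below counts the parameters added by bottleneck adapters and compares them with the size of a base model. It is an illustration under stated assumptions, not the configuration used in the paper: the bottleneck size of 64, the choice of two adapters per layer, and the rounded figure of 125 million base parameters are all assumed here, and the function names are hypothetical. Trainable parameter count is also only one proxy for training cost.

```python
def adapter_params(hidden_size: int, bottleneck: int) -> int:
    """Parameters in one bottleneck adapter.

    Counts a down-projection (hidden -> bottleneck) and an
    up-projection (bottleneck -> hidden), each with a bias vector.
    """
    down = hidden_size * bottleneck + bottleneck
    up = bottleneck * hidden_size + hidden_size
    return down + up


def added_adapter_params(hidden_size: int, num_layers: int,
                         bottleneck: int, adapters_per_layer: int) -> int:
    """Total parameters added when every layer receives adapters."""
    per_adapter = adapter_params(hidden_size, bottleneck)
    return num_layers * adapters_per_layer * per_adapter


def trainable_fraction(base_params: int, added_params: int) -> float:
    """Share of all parameters that train if only the adapters are updated."""
    return added_params / (base_params + added_params)


# Assumed values for illustration only (not taken from the paper).
HIDDEN_SIZE = 768           # hidden size of a base-sized RoBERTa encoder
NUM_LAYERS = 12             # transformer layers in a base-sized encoder
BASE_PARAMS = 125_000_000   # rounded size of the frozen base model
BOTTLENECK = 64             # hypothetical adapter bottleneck dimension
ADAPTERS_PER_LAYER = 2      # hypothetical: one after attention, one after the feed-forward block

added = added_adapter_params(HIDDEN_SIZE, NUM_LAYERS, BOTTLENECK, ADAPTERS_PER_LAYER)
fraction = trainable_fraction(BASE_PARAMS, added)

print(f"adapter parameters added: {added:,}")
print(f"trainable share with frozen base: {fraction:.2%}")
```

Under these assumptions only a small share of the parameters receives gradient updates, which is one reason adapter training can cost less. Whether task performance matches full tuning is an empirical question that the paper addresses through its benchmark.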

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Law · Natural Language Processing Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Linear Warmup With Linear Decay · Weight Decay · Attention Dropout · Dropout · Layer Normalization · Softmax · Residual Connection