Natural Language Processing for the Legal Domain: A Survey of Tasks, Datasets, Models, and Challenges

Farid Ariai; Joel Mackenzie; Gianluca Demartini

arXiv:2410.21306 · cs.CL · December 12, 2025

TL;DR

This survey reviews 131 studies on NLP in the legal domain, selected from an initial pool of 154. It covers tasks, datasets, models, and challenges, and discusses what makes legal text distinctive and which issues remain open.

Contribution

It provides a comprehensive overview of NLP tasks, models, and challenges specific to legal texts, including analysis of legal-oriented language models and adaptation approaches.

Findings

01

Legal NLP faces distinctive challenges, including extensive document lengths, complex language, and limited open legal datasets.

02

Legal-oriented language models are being developed and adapted for legal tasks.

03

Open challenges include bias detection, model interpretability, and explainability in legal NLP.

Abstract

Natural Language Processing (NLP) is revolutionising the way both professionals and laypersons operate in the legal field. The considerable potential for NLP in the legal sector, especially in developing computational assistance tools for various legal processes, has captured the interest of researchers for years. This survey follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses framework, reviewing 154 studies, with a final selection of 131 after manual filtering. It explores foundational concepts related to NLP in the legal domain, illustrating the unique aspects and challenges of processing legal texts, such as extensive document lengths, complex language, and limited open legal datasets. We provide an overview of NLP tasks specific to legal text, such as Document Summarisation, Named Entity Recognition, Question Answering, Argument Mining, Text…
