# Scaling Legal AI: Benchmarking Mamba and Transformers for Statutory Classification and Case Law Retrieval

**Authors:** Anuraj Maurya

arXiv: 2509.00141 · 2025-09-03

## TL;DR

This paper benchmarks Mamba, a linear-time state-space model, against transformer models on statutory classification and case law retrieval, and reports that Mamba scales to longer legal documents while remaining competitive in accuracy and retrieval quality.

## Contribution

The paper presents what the author describes as the first comprehensive benchmark comparing Mamba with leading transformer models (e.g., Longformer, DeBERTa) for legal NLP, with a focus on Mamba's efficiency and scalability on long contexts.

## Key findings

- Mamba's linear scaling lets it process legal documents several times longer than the transformer baselines can handle.
- Mamba matches or exceeds transformer performance on classification and retrieval.
- The study provides a new benchmark suite and open-source resources for long-context legal NLP.
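
The scaling gap behind the first finding can be illustrated with a toy operation count. The sketch below is not taken from the paper: it assumes full self-attention builds an n × n score matrix per head, while a state-space scan performs one state update per token. The function names are hypothetical and the counts ignore constants such as hidden size and state dimension.

```python
def attention_score_entries(n_tokens: int) -> int:
    """Entries in a full self-attention score matrix (per head): n * n."""
    return n_tokens * n_tokens


def scan_steps(n_tokens: int) -> int:
    """State updates in a sequential state-space scan: one per token."""
    return n_tokens


def relative_growth(cost_fn, base_tokens: int, factor: int) -> float:
    """How much the cost grows when the input gets `factor` times longer."""
    return cost_fn(base_tokens * factor) / cost_fn(base_tokens)


if __name__ == "__main__":
    base = 4096  # illustrative context length, not a figure from the paper
    for factor in (2, 4, 8):
        attn = relative_growth(attention_score_entries, base, factor)
        ssm = relative_growth(scan_steps, base, factor)
        print(f"{factor}x longer input: attention cost x{attn:.0f}, scan cost x{ssm:.0f}")
```

Under these assumptions, making a document 8 times longer multiplies the attention count by 64 but the scan count by only 8, which is the kind of gap that lets a linear-time model fit longer inputs in the same budget. Measured throughput depends on hardware and implementation, so this is only a rough intuition.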

## Abstract

The rapid growth of statutory corpora and judicial decisions requires scalable legal AI systems capable of classification and retrieval over extremely long contexts. Transformer-based architectures (e.g., Longformer, DeBERTa) dominate current legal NLP benchmarks but struggle with quadratic attention costs, limiting efficiency and scalability. In this work, we present the first comprehensive benchmarking of Mamba, a state-space model (SSM) with linear-time selective mechanisms, against leading transformer models for statutory classification and case law retrieval. We evaluate models on open-source legal corpora including LexGLUE, EUR-Lex, and ILDC, covering statutory tagging, judicial outcome prediction, and case retrieval tasks. Metrics include accuracy, recall at k, mean reciprocal rank (MRR), and normalized discounted cumulative gain (nDCG), alongside throughput measured in tokens per second and maximum context length. Results show that Mamba's linear scaling enables processing of legal documents several times longer than transformers, while maintaining or surpassing retrieval and classification performance. This study introduces a new legal NLP benchmark suite for long-context modeling, along with open-source code and datasets to support reproducibility. Our findings highlight trade-offs between state-space models and transformers, providing guidance for deploying scalable legal AI in statutory analysis, judicial decision support, and policy research.
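
For readers unfamiliar with the retrieval metrics named in the abstract, here is a minimal sketch of recall at k, MRR, and nDCG using their standard definitions with binary relevance. It is not the paper's evaluation code; the function names and the case identifiers in the example are hypothetical.

```python
import math
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant document, or 0 if none is retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Sequence[Sequence[str]],
                         relevants: Sequence[Set[str]]) -> float:
    """MRR: the reciprocal rank averaged over all queries."""
    if not runs:
        return 0.0
    scores = [reciprocal_rank(r, rel) for r, rel in zip(runs, relevants)]
    return sum(scores) / len(scores)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """nDCG@k with binary gains: DCG divided by the DCG of an ideal ranking."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, doc in enumerate(ranked[:k], start=1)
              if doc in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical ranking for one query; "c1" and "c2" are the relevant cases.
    ranked = ["c3", "c1", "c7", "c2"]
    relevant = {"c1", "c2"}
    print(recall_at_k(ranked, relevant, k=3))
    print(mean_reciprocal_rank([ranked], [relevant]))
    print(ndcg_at_k(ranked, relevant, k=4))
```

Recall at k rewards finding relevant cases anywhere in the top k, MRR only looks at the first relevant hit, and nDCG discounts hits by their rank, so the three can disagree on which model is better for a given query set.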

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2509.00141/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/2509.00141/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/2509.00141/full.md

---
Source: https://tomesphere.com/paper/2509.00141