AI for Statutory Simplification: A Comprehensive State Legal Corpus and Labor Benchmark
Emaan Hariri, Daniel E. Ho

TL;DR
This paper introduces LaborBench, a new benchmark dataset for evaluating AI's ability to simplify complex statutory language, and assesses the performance of large language models in this domain, revealing current limitations.
Contribution
It presents the LaborBench and StateCodes datasets for systematic evaluation of AI in statutory simplification, and benchmarks LLMs, highlighting their current shortcomings.
Findings
LLMs show promise but lack sufficient accuracy for regulatory simplification.
LaborBench enables systematic evaluation of AI in legal code simplification.
Current models are not yet reliable as end-to-end solutions for legal code simplification.
Abstract
One of the emerging use cases of AI in law is for code simplification: streamlining, distilling, and simplifying complex statutory or regulatory language. One U.S. state has claimed to eliminate one third of its state code using AI. Yet we lack systematic evaluations of the accuracy, reliability, and risks of such approaches. We introduce LaborBench, a question-and-answer benchmark dataset designed to evaluate AI capabilities in this domain. We leverage a unique data source to create LaborBench: a dataset updated annually by teams of lawyers at the U.S. Department of Labor, who compile differences in unemployment insurance laws across 50 states for over 101 dimensions in a six-month process, culminating in a 200-page publication of tables. Inspired by our collaboration with one U.S. state to explore using large language models (LLMs) to simplify codes in this domain, where complexity is…
