A Century of Evolution in the Complexity of the United States Legal Code
Dawoon Jeong, James Holehouse, Jisung Yoon, Chris Kempes, Geoffrey B West, Hyejin Youn

TL;DR
This paper presents a comprehensive, versioned dataset of the U.S. Code from 1926 to 2023, capturing its text, structure, and complexity to facilitate long-term analysis of legal system evolution.
Contribution
It provides the first long-term, detailed, machine-readable record of the U.S. legal code, including structural and linguistic features for interdisciplinary research.
Findings
Dataset covers 1926-2023 with detailed complexity metrics
Enables analysis of legal code growth and reorganization over time
Supports research in legal studies, data science, and complexity science
Abstract
As societies confront increasingly complex regulatory demands in domains such as digital governance, climate policy, and public health, there is a pressing need to understand how legal systems evolve, where they concentrate regulatory attention, and how their institutional architectures shape capacity for adaptation. Yet the long-term structural dynamics of law remain empirically underexplored. Here, we provide a versioned, machine-readable record of the United States Code (U.S. Code), the primary compilation of federal statutory law in the United States, covering the entire history of the Code from 1926 to 2023. We include not only the curated text of the Code but also measures of its structural and linguistic complexity: word counts, vocabulary statistics, hierarchical organization (titles, chapters, sections, subsections), and cross-references among titles. In this way, the dataset offers an…
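To make the metrics named in the abstract concrete, the following is a minimal Python sketch of how word counts, vocabulary statistics, and cross-references among titles might be computed for one block of statutory text. It is an illustration under stated assumptions, not the authors' pipeline: the function name `complexity_metrics`, the crude alphabetic tokenizer, and the citation pattern (strings of the form "42 U.S.C.") are all hypothetical choices made for this example.

```python
import re
from collections import Counter

# Assumed citation form: a title number followed by "U.S.C.", e.g. "42 U.S.C. 1983".
# Real cross-references in the Code take other forms too; this is illustrative only.
CITATION = re.compile(r"\b(\d+)\s+U\.S\.C\.")


def complexity_metrics(text: str) -> dict:
    """Return simple linguistic and cross-reference measures for a block of text."""
    # Crude tokenizer (assumption): lowercase runs of ASCII letters count as words.
    tokens = re.findall(r"[a-z]+", text.lower())
    vocab = Counter(tokens)
    # Distinct title numbers cited in the text, in ascending order.
    cited_titles = sorted({int(t) for t in CITATION.findall(text)})
    return {
        "word_count": len(tokens),
        "vocabulary_size": len(vocab),
        "type_token_ratio": len(vocab) / len(tokens) if tokens else 0.0,
        "cited_titles": cited_titles,
    }


if __name__ == "__main__":
    sample = "See 42 U.S.C. 1983 and 5 U.S.C. 552."
    print(complexity_metrics(sample))
```

Applying a function like this to each title in each yearly edition would give one row per title and year, which is the kind of table a longitudinal analysis of growth and reorganization needs.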
