CHANCERY: Evaluating Corporate Governance Reasoning Capabilities in Language Models
Lucas Irwin, Arda Kaz, Peiyao Sheng, Sewoong Oh, Pramod Viswanath

TL;DR
This paper introduces CHANCERY, a novel benchmark for evaluating language models' ability to reason about corporate governance law, and uses it to show where current models succeed and where they fall short on legal reasoning tasks.
Contribution
It presents the first corporate governance reasoning benchmark modeled after real-world law, with diverse data and analysis of model performance.
Findings
GPT-4o achieves 75.2% accuracy on the benchmark
Reasoning agents outperform standard models
Models struggle with complex legal reasoning questions
Abstract
Law has long been a popular domain for natural language processing (NLP) applications. Reasoning (ratiocination and the ability to make connections to precedent) is a core part of the practice of law in the real world. Nevertheless, while multiple legal datasets exist, none has thus far focused specifically on reasoning tasks. We focus on a specific aspect of the legal landscape by introducing a corporate governance reasoning benchmark (CHANCERY) that tests a model's ability to reason about whether actions proposed by executives, boards, or shareholders are consistent with corporate governance charters. This is a first-of-its-kind corporate governance reasoning test for language models, modeled after real-world corporate governance law. The benchmark consists of a corporate charter (a set of governing covenants) and a proposal for executive action. The model's…
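The task format described in the abstract (a charter paired with a proposal, scored by accuracy as in the Findings) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract is truncated before it states the exact output format, so the binary consistent/inconsistent label, the prompt wording, and the names `GovernanceExample`, `build_prompt`, `parse_answer`, and `accuracy` are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class GovernanceExample:
    """One hypothetical benchmark item (field names are assumptions)."""
    charter: str      # text of the corporate charter (governing covenants)
    proposal: str     # proposed executive/board/shareholder action
    consistent: bool  # assumed gold label: is the proposal consistent with the charter?


def build_prompt(example: GovernanceExample) -> str:
    """Assemble an illustrative prompt; the paper's actual wording may differ."""
    return (
        "Charter:\n" + example.charter + "\n\n"
        "Proposal:\n" + example.proposal + "\n\n"
        "Is the proposal consistent with the charter? Answer Yes or No."
    )


def parse_answer(reply: str) -> bool:
    """Map a free-text reply to a boolean; anything not starting with 'yes' counts as No."""
    return reply.strip().lower().startswith("yes")


def accuracy(
    examples: Sequence[GovernanceExample],
    model: Callable[[str], str],
) -> float:
    """Fraction of examples where the model's parsed answer matches the gold label."""
    if not examples:
        raise ValueError("need at least one example")
    correct = sum(
        parse_answer(model(build_prompt(ex))) == ex.consistent for ex in examples
    )
    return correct / len(examples)
```

Here `model` is any callable that takes a prompt string and returns the model's reply as a string, so a real API client or a stub can be passed in without changing the scoring code.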
Taxonomy
Topics: Artificial Intelligence in Law · Explainable Artificial Intelligence (XAI) · Auditing, Earnings Management, Governance
Methods: Focus · Sparse Evolutionary Training
