CHANCERY: Evaluating Corporate Governance Reasoning Capabilities in Language Models

Lucas Irwin; Arda Kaz; Peiyao Sheng; Sewoong Oh; Pramod Viswanath

arXiv:2506.04636 · cs.AI · June 13, 2025

TL;DR

This paper introduces CHANCERY, a benchmark for evaluating language models' ability to reason about corporate governance law, and uses it to show where current models succeed and where they fall short on legal reasoning tasks.

Contribution

It presents the first corporate governance reasoning benchmark modeled after real-world law, with diverse data and analysis of model performance.

Findings

1. GPT-4o achieves 75.2% accuracy on the benchmark.
2. Reasoning agents outperform standard models.
3. Models struggle with complex legal reasoning questions.

Abstract

Law has long been a popular domain for natural language processing (NLP) applications. Reasoning (ratiocination and the ability to make connections to precedent) is a core part of legal practice. Nevertheless, while multiple legal datasets exist, none have thus far focused specifically on reasoning tasks. We focus on one aspect of the legal landscape by introducing a corporate governance reasoning benchmark (CHANCERY) that tests a model's ability to reason about whether actions proposed by executives, boards, or shareholders are consistent with corporate governance charters. It is a first-of-its-kind corporate governance reasoning test for language models, modeled after real-world corporate governance law. Each benchmark instance consists of a corporate charter (a set of governing covenants) and a proposal for executive action. The model's…
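As a rough illustration of the setup the abstract describes, the sketch below pairs a charter with a proposal and scores a predictor by accuracy. It is not the authors' code or data format: the `Item` fields, the boolean `consistent` label, the `accuracy` helper, and the two toy items are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Item:
    """One hypothetical benchmark instance (field names are assumptions)."""
    charter: str      # the governing covenants
    proposal: str     # the proposed executive/board/shareholder action
    consistent: bool  # assumed gold label: does the proposal comply with the charter?


def accuracy(items: Sequence[Item], predict: Callable[[str, str], bool]) -> float:
    """Fraction of items where the predictor's verdict matches the gold label."""
    if not items:
        raise ValueError("no items to score")
    correct = sum(predict(it.charter, it.proposal) == it.consistent for it in items)
    return correct / len(items)


def always_consistent(charter: str, proposal: str) -> bool:
    """Trivial baseline: claims every proposal is consistent with its charter."""
    return True


# Invented toy data, only to exercise the scoring function.
toy_items = [
    Item("Directors may be removed only for cause.",
         "Remove a director without cause by majority vote.", False),
    Item("The board may amend the bylaws.",
         "The board amends the bylaws to change the meeting date.", True),
]

baseline_score = accuracy(toy_items, always_consistent)
```

In practice `predict` would wrap a language model call that reads the charter and proposal and returns a verdict; the trivial baseline here only demonstrates the interface.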

Taxonomy

Topics: Artificial Intelligence in Law · Explainable Artificial Intelligence (XAI) · Auditing, Earnings Management, Governance

Methods: Focus · Sparse Evolutionary Training