ACORD: An Expert-Annotated Retrieval Dataset for Legal Contract Drafting

Steven H. Wang; Maksim Zubkov; Kexin Fan; Sarah Harrell; Yuyang Sun; Wei Chen; Andreas Plesner; Roger Wattenhofer

arXiv:2501.06582 · cs.CL · September 23, 2025

TL;DR

ACORD is the first expert-annotated dataset for legal contract clause retrieval, providing a benchmark to improve retrieval systems in legal drafting, with promising initial results but room for advancement.

Contribution

Introduces ACORD, a comprehensive, expert-annotated retrieval dataset for complex legal contract clauses, filling a critical gap in legal NLP resources.

Findings

01. A bi-encoder retriever with LLM re-rankers shows promising results.

02. Substantial improvements are still needed for complex legal tasks.

03. ACORD serves as a valuable benchmark for future legal IR research.

Abstract

Information retrieval, specifically contract clause retrieval, is foundational to contract drafting because lawyers rarely draft contracts from scratch; instead, they locate and revise the most relevant precedent. We introduce the Atticus Clause Retrieval Dataset (ACORD), the first retrieval benchmark for contract drafting fully annotated by experts. ACORD focuses on complex contract clauses such as Limitation of Liability, Indemnification, Change of Control, and Most Favored Nation. It includes 114 queries and over 126,000 query-clause pairs, each ranked on a scale from 1 to 5 stars. The task is to find the precedent clauses most relevant to a query. A bi-encoder retriever paired with pointwise LLM re-rankers shows promising results. However, substantial improvements are still needed to effectively manage the complex legal work typically undertaken by lawyers. As the first retrieval…
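
The retrieve-then-re-rank setup mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: `embed` is a toy bag-of-words stand-in for a learned bi-encoder, `toy_star_scorer` is a hypothetical placeholder for a pointwise LLM call that returns a 1 to 5 star rating, and the example clauses are invented.

```python
import math
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    """Toy stand-in for a bi-encoder: a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, clauses: List[str], k: int) -> List[str]:
    """Stage 1: embed query and clauses independently, keep the top-k by similarity."""
    q = embed(query)
    ranked = sorted(clauses, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def toy_star_scorer(query: str, clause: str) -> int:
    """Hypothetical placeholder for a pointwise LLM judgment (1 to 5 stars).

    A real system would prompt a model with one (query, clause) pair at a time;
    here the score is just clipped word overlap.
    """
    overlap = len(set(query.lower().split()) & set(clause.lower().split()))
    return max(1, min(5, 1 + overlap))


def rerank(query: str, candidates: List[str],
           score_fn: Callable[[str, str], int]) -> List[str]:
    """Stage 2: score each candidate on its own (pointwise) and sort by score."""
    return sorted(candidates, key=lambda c: score_fn(query, c), reverse=True)


if __name__ == "__main__":
    # Invented example clauses, not taken from ACORD.
    clauses = [
        "liability of each party is capped at fees paid",
        "supplier shall indemnify customer against third party claims",
        "this agreement is governed by the laws of delaware",
        "neither party is liable for indirect damages and liability is capped",
    ]
    query = "liability capped"
    shortlist = retrieve(query, clauses, k=2)
    for clause in rerank(query, shortlist, toy_star_scorer):
        print(toy_star_scorer(query, clause), clause)
```

The split reflects a common design choice: the first stage is cheap because clause embeddings can be computed once and reused, while the more expensive pointwise scorer only sees the short list.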

Taxonomy

Topics: Artificial Intelligence in Law · Multi-Agent Systems and Negotiation