ACORD: An Expert-Annotated Retrieval Dataset for Legal Contract Drafting
Steven H. Wang, Maksim Zubkov, Kexin Fan, Sarah Harrell, Yuyang Sun, Wei Chen, Andreas Plesner, Roger Wattenhofer

TL;DR
ACORD is the first expert-annotated dataset for legal contract clause retrieval, providing a benchmark to improve retrieval systems in legal drafting, with promising initial results but room for advancement.
Contribution
Introduces ACORD, a comprehensive, expert-annotated retrieval dataset for complex legal contract clauses, filling a critical gap in legal NLP resources.
Findings
A bi-encoder retriever paired with pointwise LLM re-rankers shows promising results.
Substantial improvements are still needed for complex legal tasks.
ACORD serves as a valuable benchmark for future legal IR research.
Abstract
Information retrieval, specifically contract clause retrieval, is foundational to contract drafting because lawyers rarely draft contracts from scratch; instead, they locate and revise the most relevant precedent. We introduce the Atticus Clause Retrieval Dataset (ACORD), the first retrieval benchmark for contract drafting fully annotated by experts. ACORD focuses on complex contract clauses such as Limitation of Liability, Indemnification, Change of Control, and Most Favored Nation. It includes 114 queries and over 126,000 query-clause pairs, each ranked on a scale from 1 to 5 stars. The task is to find the most relevant precedent clauses to a query. A bi-encoder retriever paired with pointwise LLM re-rankers shows promising results. However, substantial improvements are still needed to effectively manage the complex legal work typically undertaken by lawyers. As the first retrieval…
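The two-stage setup named in the abstract (a bi-encoder retriever followed by a pointwise re-ranker) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the bag-of-words `embed` function stands in for a learned bi-encoder, `stub_star_score` is a hypothetical placeholder for an LLM call that returns a 1 to 5 star rating, and the clauses and query are invented examples rather than ACORD data.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a bi-encoder: a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, clauses, k):
    """Stage 1: encode query and clauses independently, keep the top-k by similarity."""
    q = embed(query)
    return sorted(clauses, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def stub_star_score(query, clause):
    """Hypothetical placeholder for an LLM that rates one (query, clause) pair, 1-5 stars."""
    overlap = len(set(query.lower().split()) & set(clause.lower().split()))
    return max(1, min(5, 1 + overlap))


def rerank(query, candidates, score_fn):
    """Stage 2: pointwise re-ranking; each pair is scored on its own, then sorted."""
    return sorted(candidates, key=lambda c: score_fn(query, c), reverse=True)


# Invented example clauses (not taken from ACORD).
clauses = [
    "Neither party shall be liable for indirect or consequential damages",
    "Supplier shall indemnify Buyer against third party claims",
    "This agreement is governed by the laws of New York",
    "Liability of each party is capped at fees paid in the prior twelve months",
]
query = "limitation of liability cap on damages"

candidates = retrieve(query, clauses, k=3)
ranked = rerank(query, candidates, stub_star_score)
for clause in ranked:
    print(stub_star_score(query, clause), clause)
```

The split reflects a common design choice: the first stage is cheap because clause vectors can be computed once and reused across queries, while the second stage spends a more expensive model only on the short candidate list.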
Taxonomy
Topics: Artificial Intelligence in Law · Multi-Agent Systems and Negotiation
