ContractNLI: A Dataset for Document-level Natural Language Inference for Contracts
Yuta Koreeda, Christopher D. Manning

TL;DR
This paper introduces ContractNLI, a large dataset for document-level natural language inference on contracts, highlighting the challenges and proposing a strong baseline model that improves evidence identification in lengthy legal documents.
Contribution
The paper presents the largest annotated corpus to date for document-level NLI on contracts (607 contracts), shows that existing models fail on the task, and proposes a strong baseline built on span-based evidence identification.
Findings
Existing models perform poorly on contract NLI.
A new span-based evidence identification approach improves performance.
Linguistic features of contracts, such as negation, make the task harder.
Abstract
Reviewing contracts is a time-consuming procedure that incurs large expenses to companies and social inequality to those who cannot afford it. In this work, we propose "document-level natural language inference (NLI) for contracts", a novel, real-world application of NLI that addresses such problems. In this task, a system is given a set of hypotheses (such as "Some obligations of Agreement may survive termination.") and a contract, and it is asked to classify whether each hypothesis is "entailed by", "contradicting to" or "not mentioned by" (neutral to) the contract as well as identifying "evidence" for the decision as spans in the contract. We annotated and release the largest corpus to date consisting of 607 annotated contracts. We then show that existing models fail badly on our task and introduce a strong baseline, which (1) models evidence identification as multi-label…
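The task described in the abstract takes a contract and a hypothesis as input and returns one of three labels plus evidence spans. The sketch below shows one possible way to represent that input and output in Python. All class and function names (NLILabel, Span, Prediction, evidence_text) and the toy contract text are illustrative assumptions, not the dataset's actual schema or the authors' code.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class NLILabel(Enum):
    # The three classes named in the abstract.
    ENTAILMENT = "entailed by"
    CONTRADICTION = "contradicting to"
    NOT_MENTIONED = "not mentioned by"


@dataclass(frozen=True)
class Span:
    # Hypothetical representation: character offsets into the contract text.
    start: int  # inclusive
    end: int    # exclusive


@dataclass
class Prediction:
    hypothesis: str
    label: NLILabel
    # Evidence is a set of spans, so several spans may support one decision.
    evidence: List[Span] = field(default_factory=list)


def evidence_text(contract: str, prediction: Prediction) -> List[str]:
    """Return the contract substrings that the prediction cites as evidence."""
    return [contract[s.start:s.end] for s in prediction.evidence]


# Toy contract text, invented for illustration only.
contract = (
    "This Agreement ends after two years. "
    "Confidentiality obligations survive termination."
)
sentence = "Confidentiality obligations survive termination."
start = contract.find(sentence)

pred = Prediction(
    hypothesis="Some obligations of Agreement may survive termination.",
    label=NLILabel.ENTAILMENT,
    evidence=[Span(start, start + len(sentence))],
)

print(pred.label.value, evidence_text(contract, pred))
```

Keeping evidence as a list of spans, rather than a single span, reflects the abstract's framing of evidence as "spans in the contract"; whether offsets are measured in characters, tokens, or sentences is a design choice this sketch leaves open.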
Taxonomy
Topics: Artificial Intelligence in Law · Topic Modeling · Natural Language Processing Techniques
