The Law of Large Documents: Understanding the Structure of Legal Contracts Using Visual Cues
Allison Hegel, Marina Shah, Genevieve Peaslee, Brendan Roof, Emad Elwany

TL;DR
This paper shows that visual cues such as layout and style, obtained with computer vision methods, improve segmentation and information extraction on long legal documents.
Contribution
The study introduces a novel approach that leverages visual cues for segmenting and understanding long legal documents, outperforming existing methods.
Findings
Visual cues improve document segmentation accuracy
Enhanced understanding of legal contracts using visual features
Outperforms existing methods on four long-document tasks
Abstract
Large, pre-trained transformer models like BERT have achieved state-of-the-art results on document understanding tasks, but most implementations can only consider 512 tokens at a time. For many real-world applications, documents can be much longer, and the segmentation strategies typically used on longer documents miss out on document structure and contextual information, hurting their results on downstream tasks. In our work on legal agreements, we find that visual cues such as layout, style, and placement of text in a document are strong features that are crucial to achieving an acceptable level of accuracy on long documents. We measure the impact of incorporating such visual cues, obtained via computer vision methods, on the accuracy of document understanding tasks including document segmentation, entity extraction, and attribute classification. Our method of segmenting documents…
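To make the 512-token limit concrete, the following is a minimal sketch of the fixed-size windowing that the abstract contrasts with structure-aware segmentation. It is not the authors' method. The function name `fixed_windows`, the `stride` parameter, and the use of a pre-tokenized list are assumptions for illustration; BERT itself operates on WordPiece tokens rather than whole words.

```python
from typing import List

MAX_TOKENS = 512  # per-pass limit cited in the abstract


def fixed_windows(
    tokens: List[str],
    max_tokens: int = MAX_TOKENS,
    stride: int = MAX_TOKENS,
) -> List[List[str]]:
    """Split a token list into windows of at most `max_tokens` tokens.

    Hypothetical helper: a `stride` smaller than `max_tokens` yields
    overlapping windows; a `stride` equal to it yields disjoint ones.
    """
    if max_tokens <= 0 or stride <= 0:
        raise ValueError("max_tokens and stride must be positive")
    windows: List[List[str]] = []
    for start in range(0, len(tokens), stride):
        windows.append(tokens[start:start + max_tokens])
        # Stop once a window has reached the end of the document.
        if start + max_tokens >= len(tokens):
            break
    return windows


# A stand-in "document" of 1200 tokens.
doc_tokens = [f"tok{i}" for i in range(1200)]
disjoint = fixed_windows(doc_tokens)
overlapping = fixed_windows(doc_tokens, stride=256)
print([len(w) for w in disjoint])
print([len(w) for w in overlapping])
```

Because window boundaries depend only on token position, a clause that straddles token 512 is cut in two regardless of headings or layout. This is the loss of document structure that the abstract attributes to typical segmentation strategies.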
Taxonomy
Topics: Artificial Intelligence in Law · Law in Society and Culture · Natural Language Processing Techniques
Methods: Attention Is All You Need · Linear Layer · Weight Decay · Multi-Head Attention · Linear Warmup With Linear Decay · Residual Connection · Dense Connections · Softmax · WordPiece
