Metadata Extraction Leveraging Large Language Models

Cuize Han; Sesh Jalagam

arXiv:2510.19334 · stat.ML · October 23, 2025

Open Access

TL;DR

This paper shows how large language models can automate metadata extraction in legal contract review and improve its accuracy, reducing review time and cost.

Contribution

The paper presents an LLM-based implementation for detecting and annotating legal clauses, combining robust text conversion and strategic chunk selection with Chain of Thought (CoT) prompting and structured tool calling.

Findings

1. Significant improvement in clause identification accuracy
2. Enhanced efficiency in contract review processes
3. Potential to lower legal review costs

Abstract

The advent of Large Language Models has revolutionized tasks across domains, including the automation of legal document analysis, a critical component of modern contract management systems. This paper presents a comprehensive implementation of LLM-enhanced metadata extraction for contract review, focusing on the automatic detection and annotation of salient legal clauses. Leveraging both the publicly available Contract Understanding Atticus Dataset (CUAD) and proprietary contract datasets, our work demonstrates the integration of advanced LLM methodologies with practical applications. We identify three pivotal elements for optimizing metadata extraction: robust text conversion, strategic chunk selection, and advanced LLM-specific techniques, including Chain of Thought (CoT) prompting and structured tool calling. The results from our experiments highlight the substantial improvements in…
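As an illustration of what "strategic chunk selection" could look like in practice, here is a minimal sketch in Python. The abstract does not describe the authors' selection method, so this uses a simple keyword-overlap score as a stand-in; the function names (`split_into_chunks`, `score_chunk`, `select_chunks`), the chunk size, and the keyword list are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def split_into_chunks(text: str, max_words: int = 50) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def score_chunk(chunk: str, keywords: list[str]) -> int:
    """Count occurrences of single-word keywords in the chunk (case-insensitive)."""
    tokens = Counter(re.findall(r"[a-z]+", chunk.lower()))
    return sum(tokens[k.lower()] for k in keywords)


def select_chunks(text: str, keywords: list[str],
                  max_words: int = 50, top_k: int = 2) -> list[str]:
    """Keep the top_k highest-scoring chunks, returned in document order.

    Assumption: keyword overlap is a placeholder relevance score; a real
    system might use embeddings or a retrieval model instead.
    """
    chunks = split_into_chunks(text, max_words)
    scores = [score_chunk(c, keywords) for c in chunks]
    ranked = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)
    keep = sorted(i for i in ranked[:top_k] if scores[i] > 0)
    return [chunks[i] for i in keep]
```

Only the selected chunks would then be passed to the LLM for a given clause type (for example, a governing-law clause), which keeps the prompt short and focused on the passages most likely to contain the clause.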

Taxonomy

Topics: Artificial Intelligence in Law · Business Law and Ethics · Artificial Intelligence Applications