LED: A Benchmark for Evaluating Layout Error Detection in Document Analysis

Inbum Heo; Taewook Hwang; Jeesu Jung; Sangkeun Jung

arXiv:2603.17265 · cs.CV · March 19, 2026

TL;DR

This paper introduces LED, a benchmark for detecting structural errors in document layout analysis predictions. It targets logical inconsistencies, such as merged, split, or omitted regions, that overlap-based metrics like IoU and mAP fail to capture.

Contribution

We propose LED, a new benchmark with standardized error types, realistic error simulation, and evaluation tasks to assess structural reasoning in document layout analysis models.

Findings

1. State-of-the-art models show weaknesses in structural understanding.

2. LED enables detailed assessment of model reasoning capabilities.

3. The benchmark reveals modality- and architecture-specific deficiencies.

Abstract

Recent advances in Large Language Models (LLMs) and Large Multimodal Models (LMMs) have improved Document Layout Analysis (DLA), yet structural errors such as region merging, splitting, and omission remain persistent. Conventional overlap-based metrics (e.g., IoU, mAP) fail to capture such logical inconsistencies. To overcome this limitation, we propose Layout Error Detection (LED), a benchmark that evaluates structural reasoning in DLA predictions beyond surface-level accuracy. LED defines eight standardized error types (Missing, Hallucination, Size Error, Split, Merge, Overlap, Duplicate, and Misclassification) and provides quantitative rules and injection algorithms for realistic error simulation. Using these definitions, we construct LED-Dataset and design three evaluation tasks: document-level error detection, document-level error-type classification, and element-level error-type…
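As a rough illustration of why overlap scores alone do not identify structural errors, the sketch below lists the eight error types named in the abstract and applies two hypothetical injection helpers (`inject_split` and `inject_merge`) to toy boxes. The helper names, the box format `(x1, y1, x2, y2)`, and the split ratio are assumptions made for this example; they are not the quantitative rules or injection algorithms defined by LED.

```python
from enum import Enum
from typing import List, Tuple

# Axis-aligned box as (x1, y1, x2, y2); this format is an assumption.
Box = Tuple[float, float, float, float]


class LayoutError(Enum):
    """The eight error types listed in the abstract."""
    MISSING = "Missing"
    HALLUCINATION = "Hallucination"
    SIZE_ERROR = "Size Error"
    SPLIT = "Split"
    MERGE = "Merge"
    OVERLAP = "Overlap"
    DUPLICATE = "Duplicate"
    MISCLASSIFICATION = "Misclassification"


def area(b: Box) -> float:
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter = area((max(a[0], b[0]), max(a[1], b[1]),
                  min(a[2], b[2]), min(a[3], b[3])))
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def inject_split(box: Box, ratio: float = 0.5) -> List[Box]:
    """Hypothetical Split injection: cut one box horizontally in two."""
    x1, y1, x2, y2 = box
    cut = y1 + (y2 - y1) * ratio
    return [(x1, y1, x2, cut), (x1, cut, x2, y2)]


def inject_merge(a: Box, b: Box) -> Box:
    """Hypothetical Merge injection: replace two boxes by their hull."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


# A paragraph split into two halves: each half overlaps the ground truth
# with the same IoU, but the number says nothing about *why* it is low.
paragraph: Box = (0.0, 0.0, 100.0, 40.0)
split_ious = [iou(paragraph, half) for half in inject_split(paragraph)]

# Two separate paragraphs merged into one region.
second: Box = (0.0, 50.0, 100.0, 90.0)
merged = inject_merge(paragraph, second)
merge_iou = iou(paragraph, merged)
```

In this toy case both the split and the merge yield moderate IoU values against the ground truth, so the score alone cannot tell the two error types apart; distinguishing them requires reasoning about how predicted regions relate to one another, which is the kind of task the benchmark describes.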

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Materials Science