Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis
Jiawei Wang, Kai Hu, Zhuoyao Zhong, Lei Sun, Qiang Huo

TL;DR
This paper introduces a tree construction approach for hierarchical document structure analysis, integrating detection, reading order prediction, and structure building, achieving state-of-the-art results on multiple datasets.
Contribution
The paper proposes an end-to-end framework for hierarchical document structure analysis and introduces the Comp-HRDoc benchmark for evaluating such systems.
Findings
Achieves state-of-the-art performance on the PubLayNet and DocLayNet datasets.
Introduces a new benchmark, Comp-HRDoc, for evaluating hierarchical document structure analysis.
Shows that a single framework can handle page object detection, reading order prediction, and hierarchical structure construction concurrently.
Abstract
Document structure analysis (aka document layout analysis) is crucial for understanding the physical layout and logical structure of documents, with applications in information retrieval, document summarization, knowledge extraction, etc. In this paper, we concentrate on Hierarchical Document Structure Analysis (HDSA) to explore hierarchical relationships within structured documents created using authoring software employing hierarchical schemas, such as LaTeX, Microsoft Word, and HTML. To comprehensively analyze hierarchical document structures, we propose a tree construction based approach that addresses multiple subtasks concurrently, including page object detection (Detect), reading order prediction of identified objects (Order), and the construction of intended hierarchical structure (Construct). We present an effective end-to-end solution based on this framework to demonstrate its…
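The three stages named in the abstract (Detect, Order, Construct) can be illustrated with a toy data-flow sketch. This is not the authors' model: the `PageObject` schema, the `successor` and `parent` dictionaries, and all function names are hypothetical, and the relation predictions that a trained network would produce are simply hard-coded here. The sketch only shows how detected objects, a predicted reading order, and predicted parent links could be combined into a document tree.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PageObject:
    """A detected page object (hypothetical schema, not taken from the paper)."""
    obj_id: int
    category: str  # e.g. "title", "section", "paragraph"
    text: str
    children: List["PageObject"] = field(default_factory=list)


def order_objects(objects: List[PageObject],
                  successor: Dict[int, int]) -> List[PageObject]:
    """Order step: follow assumed successor links from the unique start object."""
    by_id = {o.obj_id: o for o in objects}
    has_predecessor = set(successor.values())
    starts = [o.obj_id for o in objects if o.obj_id not in has_predecessor]
    if len(starts) != 1:
        raise ValueError("expected exactly one object without a predecessor")
    ordered: List[PageObject] = []
    seen = set()
    current: Optional[int] = starts[0]
    while current is not None:
        if current in seen:
            raise ValueError("cycle in successor links")
        seen.add(current)
        ordered.append(by_id[current])
        current = successor.get(current)
    if len(ordered) != len(objects):
        raise ValueError("successor links do not cover every object")
    return ordered


def construct_tree(ordered: List[PageObject],
                   parent: Dict[int, Optional[int]]) -> PageObject:
    """Construct step: attach each object to its assumed parent, in reading order."""
    by_id = {o.obj_id: o for o in ordered}
    root = PageObject(obj_id=-1, category="document", text="")
    for obj in ordered:
        parent_id = parent.get(obj.obj_id)
        target = root if parent_id is None else by_id[parent_id]
        target.children.append(obj)
    return root


def render(node: PageObject, depth: int = 0) -> List[str]:
    """Return an indented outline of the tree, one line per node."""
    label = node.category + (": " + node.text if node.text else "")
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


def build_demo() -> PageObject:
    # Detect step stand-in: objects arrive in arbitrary detection order.
    detected = [
        PageObject(4, "paragraph", "We propose ..."),
        PageObject(0, "title", "A Sample Paper"),
        PageObject(3, "section", "2 Method"),
        PageObject(2, "paragraph", "Documents are ..."),
        PageObject(1, "section", "1 Introduction"),
    ]
    # Hard-coded stand-ins for predicted relations (assumptions for illustration).
    successor = {0: 1, 1: 2, 2: 3, 3: 4}
    parent = {0: None, 1: None, 2: 1, 3: None, 4: 3}
    return construct_tree(order_objects(detected, successor), parent)


if __name__ == "__main__":
    print("\n".join(render(build_demo())))
```

Appending children while iterating in reading order keeps siblings sorted without a separate sort, which is why the ordering step runs before the tree is built in this sketch.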
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Retrieval and Classification Techniques · Web Data Mining and Analysis
