Unveiling Document Structures with YOLOv5 Layout Detection

Herman Sugiharto; Yorissa Silviana; Yani Siti Nurpazrin

arXiv:2309.17033·cs.CV·October 2, 2023·1 cites

Open Access

TL;DR

This paper applies YOLOv5 to identify and extract layout components from document images, with the aim of improving unstructured-data processing in sectors such as finance, healthcare, and education.

Contribution

It proposes a framework that applies YOLOv5 to document layout detection, treating document elements such as paragraphs, tables, and photos as objects to be detected.

Findings

01

High precision (0.91) and recall (0.971) in layout detection

02

F1-score of 0.939 indicating balanced performance

03

AUC-ROC of 0.975 demonstrating excellent classification capability
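
The F1-score listed above is the harmonic mean of precision and recall. As a quick consistency check, the sketch below recomputes it from the listed figures, assuming the 0.91 value is the precision (an assumption made for illustration, since this page does not show the underlying counts). The function name `f1_score` is a local helper, not an API from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures from the findings list; treating 0.91 as precision is an assumption.
precision, recall = 0.91, 0.971
f1 = f1_score(precision, recall)
print(f"{f1:.4f}")  # prints 0.9395
```

The result agrees with the listed 0.939 to within rounding of the inputs, which suggests the three figures are mutually consistent.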

Abstract

The current digital environment is characterized by the widespread presence of data, particularly unstructured data, which poses many issues in sectors including finance, healthcare, and education. Conventional techniques for data extraction encounter difficulties in dealing with the inherent variety and complexity of unstructured data, hence requiring the adoption of more efficient methodologies. This research investigates the utilization of YOLOv5, a cutting-edge computer vision model, for the purpose of rapidly identifying document layouts and extracting unstructured data. The present study establishes a conceptual framework for delineating the notion of "objects" as they pertain to documents, incorporating various elements such as paragraphs, tables, photos, and other constituent parts. The main objective is to create an autonomous system that can effectively recognize document…
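
To make the idea of document "objects" concrete, here is a minimal sketch of how one annotation in the standard YOLO text format (`class_id x_center y_center width height`, all normalized to the image size) could be converted into a pixel-space layout element. The class list `CLASS_NAMES`, the `LayoutObject` type, and the `parse_yolo_line` helper are hypothetical names chosen for illustration; the paper's actual label set and pipeline may differ.

```python
from dataclasses import dataclass

# Hypothetical label set for illustration; the paper's classes may differ.
CLASS_NAMES = ["paragraph", "table", "photo"]


@dataclass
class LayoutObject:
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_yolo_line(line: str, img_w: int, img_h: int) -> LayoutObject:
    """Convert one YOLO-format annotation line to a pixel-space box.

    The line holds: class_id x_center y_center width height,
    with the four box values normalized to [0, 1].
    """
    cls, xc, yc, w, h = line.split()
    # Scale the normalized center and size to pixels.
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    # Convert center/size to corner coordinates.
    return LayoutObject(
        label=CLASS_NAMES[int(cls)],
        x_min=xc - w / 2,
        y_min=yc - h / 2,
        x_max=xc + w / 2,
        y_max=yc + h / 2,
    )


# Example: a "table" box centered in the upper half of a 1000x2000 page.
obj = parse_yolo_line("1 0.5 0.25 0.8 0.3", img_w=1000, img_h=2000)
print(obj)
```

Keeping the label and corner coordinates together in one record makes it straightforward to crop each detected region for downstream extraction.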

Taxonomy

Topics: Handwritten Text Recognition Techniques · Currency Recognition and Detection · Vehicle License Plate Recognition