UnSupDLA: Towards Unsupervised Document Layout Analysis

Talha Uddin Sheikh; Tahira Shehzadi; Khurram Azeem Hashmi; Didier Stricker; Muhammad Zeshan Afzal

arXiv:2406.06236·cs.CV·June 11, 2024

Open Access

TL;DR

This paper introduces UnSupDLA, an unsupervised method for document layout analysis that leverages vision-based pre-training and iterative refinement to improve detection and segmentation without labeled data.

Contribution

It presents a novel unsupervised training framework for document layout analysis, reducing reliance on labeled datasets and enhancing accuracy through iterative self-training.

Findings

1. Improved detection and segmentation accuracy on document datasets.
2. Reduced need for labeled data in layout analysis.
3. Enhanced efficiency in processing diverse online documents.

Abstract

Document layout analysis is a key area in document research, involving techniques like text mining and visual analysis. Despite various methods developed to tackle layout analysis, a critical but frequently overlooked problem is the scarcity of labeled data needed for analyses. With the rise of internet use, an overwhelming number of documents are now available online, making the process of accurately labeling them for research purposes increasingly challenging and labor-intensive. Moreover, the diversity of documents online presents a unique set of challenges in maintaining the quality and consistency of these labels, further complicating document layout analysis in the digital era. To address this, we employ a vision-based approach for analyzing document layouts designed to train a network without labels. Instead, we focus on pre-training, initially generating simple object masks from…

Taxonomy

Topics: Scientific Computing and Data Management

Methods: Sparse Evolutionary Training · Focus