Cross-Domain Document Layout Analysis Using Document Style Guide
Xingjiao Wu, Luwei Xiao, Xiangcheng Du, Yingbin Zheng, Xin Li, Tianlong Ma, Cheng Jin, Liang He

TL;DR
This paper presents an unsupervised cross-domain document layout analysis framework that leverages document style guidance, quality assessment, and contrastive learning to improve generalization across diverse document styles and layouts.
Contribution
It introduces a novel unsupervised framework combining style guidance, quality assessment, and contrastive learning for robust cross-domain document layout analysis.
Findings
Achieves remarkable performance on diverse datasets.
Effectively bridges the gap between synthetic and real document styles.
Outperforms existing methods in generalization capability.
Abstract
Document layout analysis (DLA) aims to decompose document images into high-level semantic areas (i.e., figures, tables, texts, and background). Creating a DLA framework with strong generalization capabilities is challenging because document objects vary in layout, size, aspect ratio, texture, etc. Many researchers have addressed this challenge by synthesizing data to build large training sets. However, synthetic training data varies in style and is of erratic quality. Besides, there is a large gap between the source data and the target data. In this paper, we propose an unsupervised cross-domain DLA framework based on document style guidance. We integrate document quality assessment and cross-domain document analysis into a unified framework. Our framework is composed of three components: Document Layout Generator (GLD), Document Elements Decorator (GED), and Document…
Taxonomy
Topics: Image Retrieval and Classification Techniques · Handwritten Text Recognition Techniques · Advanced Image and Video Retrieval Techniques
Methods: Contrastive Learning · Deep Layer Aggregation
