PromptDLA: A Domain-aware Prompt Document Layout Analysis Framework with Descriptive Knowledge as a Cue
Zirui Zhang, Yaping Zhang, Lu Xiang, Yang Zhao, Feifei Zhai, Yu Zhou, and Chengqing Zong

TL;DR
PromptDLA introduces a domain-aware prompting framework that leverages descriptive knowledge to improve document layout analysis across diverse domains, addressing the limitations of dataset merging.
Contribution
The paper proposes a domain-aware prompter that incorporates domain priors into DLA, improving cross-domain generalization and achieving state-of-the-art results.
Findings
Achieves state-of-the-art performance on multiple DLA datasets.
Effectively leverages descriptive knowledge as cues for domain adaptation.
Improves model generalization across varied document types and languages.
Abstract
Document Layout Analysis (DLA) is crucial for document artificial intelligence and has recently received increasing attention, resulting in an influx of large-scale public DLA datasets. Existing work often combines data from various domains in recent public DLA datasets to improve the generalization of DLA. However, directly merging these datasets for training often results in suboptimal model performance, as it overlooks the different layout structures inherent to various domains. These variations include different labeling styles, document types, and languages. This paper introduces PromptDLA, a domain-aware Prompter for Document Layout Analysis that effectively leverages descriptive knowledge as cues to integrate domain priors into DLA. The innovative PromptDLA features a unique domain-aware prompter that customizes prompts based on the specific attributes of the data domain. These…
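To make the idea of a descriptive domain cue concrete, here is a minimal Python sketch that turns the three sources of domain variation named in the abstract (labeling style, document type, language) into a short text prompt. This is an illustration only, not the authors' implementation: the `DomainInfo` class, the `build_domain_prompt` function, the template wording, and the example values are all hypothetical, and the abstract does not say how the real prompter encodes or consumes its prompts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainInfo:
    """Hypothetical container for the domain attributes the abstract names."""
    document_type: str   # e.g. "scientific article"
    language: str        # e.g. "English"
    labeling_style: str  # e.g. the name of an annotation scheme


def build_domain_prompt(info: DomainInfo) -> str:
    """Compose a plain-text description of a data domain.

    The template wording is an assumption made for illustration only.
    """
    return (
        f"A {info.document_type} document in {info.language}, "
        f"annotated in the {info.labeling_style} style."
    )


# Example with made-up values; "dataset-A" is a placeholder, not a real dataset.
prompt = build_domain_prompt(DomainInfo("scientific article", "English", "dataset-A"))
print(prompt)
```

In a setup like this, each training sample would carry the prompt for the dataset it came from, so a model trained on merged data could still tell the domains apart.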
Taxonomy
Topics: Handwritten Text Recognition Techniques · Topic Modeling · Advanced Neural Network Applications
