Cross-Domain Document Object Detection: Benchmark Suite and Method

Kai Li; Curtis Wigington; Chris Tensmeyer; Handong Zhao; Nikolaos Barmpalios; Vlad I. Morariu; Varun Manjunatha; Tong Sun; Yun Fu

arXiv:2003.13197 · cs.CV · March 31, 2020 · 5 cites

TL;DR

This paper introduces a benchmark suite for cross-domain document object detection and a detection method that handles domain shift by aligning features, regions, and rendering layers across domains, improving detection performance across diverse document datasets.

Contribution

The paper establishes a comprehensive benchmark suite for cross-domain document object detection and proposes a new detection model with three alignment modules to address domain shifts.

Findings

01. The proposed method outperforms baseline models on the benchmark suite.

02. The three alignment modules significantly improve detection accuracy.

03. Extensive experiments validate the effectiveness of the approach.

Abstract

Decomposing images of document pages into high-level semantic regions (e.g., figures, tables, paragraphs), document object detection (DOD) is fundamental for downstream tasks like intelligent document editing and understanding. DOD remains a challenging problem as document objects vary significantly in layout, size, aspect ratio, texture, etc. An additional challenge arises in practice because large labeled training datasets are only available for domains that differ from the target domain. We investigate cross-domain DOD, where the goal is to learn a detector for the target domain using labeled data from the source domain and only unlabeled data from the target domain. Documents from the two domains may vary significantly in layout, language, and genre. We establish a benchmark suite consisting of different types of PDF document datasets that can be utilized for cross-domain DOD model…

Code & Models

Repositories

kailigo/cddod (PyTorch, official)

Videos

Cross-Domain Document Object Detection: Benchmark Suite and Method (YouTube)

Taxonomy

Topics: Handwritten Text Recognition Techniques · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques