IIIT-AR-13K: A New Dataset for Graphical Object Detection in Documents
Ajoy Mondal, Peter Lipps, and C. V. Jawahar

TL;DR
The paper introduces IIIT-AR-13K, a large, diverse dataset of annotated business document images for graphical object detection, and benchmarks it with state-of-the-art models to establish baselines.
Contribution
It provides the largest manually annotated dataset for graphical objects in business documents, enabling improved detection methods and benchmarking.
Findings
Faster R-CNN and Mask R-CNN establish high baselines on the dataset
The dataset is effective as training data for graphical object detectors
A single model trained on this dataset outperforms models trained on larger datasets
Abstract
We introduce a new dataset for graphical object detection in business documents, more specifically annual reports. This dataset, IIIT-AR-13K, is created by manually annotating the bounding boxes of graphical or page objects in publicly available annual reports. It contains a total of 13k annotated page images with objects in five popular categories: table, figure, natural image, logo, and signature. It is the largest manually annotated dataset for graphical object detection. Annual reports created in multiple languages over several years by various companies bring high diversity to this dataset. We benchmark the IIIT-AR-13K dataset with two state-of-the-art graphical object detection techniques, Faster R-CNN [20] and Mask R-CNN [11], and establish high baselines for further research. Our dataset is highly effective as training data for developing practical…
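As an illustration of the annotation scheme the abstract describes (bounding boxes labeled with one of five categories), the following is a minimal Python sketch. The class name `BoxAnnotation`, its field names, the helper `count_by_category`, and the sample coordinates are all hypothetical; the dataset's actual file format is not described here, so this shows only one plausible in-memory representation.

```python
from collections import Counter
from dataclasses import dataclass

# The five categories named in the abstract.
CATEGORIES = ("table", "figure", "natural image", "logo", "signature")


@dataclass(frozen=True)
class BoxAnnotation:
    """One annotated graphical object on a page (hypothetical layout)."""

    category: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def __post_init__(self):
        # Reject labels outside the five categories and degenerate boxes.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        if self.x_max <= self.x_min or self.y_max <= self.y_min:
            raise ValueError("box must have positive width and height")

    @property
    def area(self) -> float:
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


def count_by_category(annotations):
    """Return how many objects of each category appear in `annotations`."""
    counts = Counter(a.category for a in annotations)
    return {name: counts.get(name, 0) for name in CATEGORIES}


# Made-up annotations for a single page, in pixel coordinates.
page = [
    BoxAnnotation("table", 50, 100, 550, 400),
    BoxAnnotation("logo", 20, 20, 120, 60),
    BoxAnnotation("table", 50, 450, 550, 700),
]
print(count_by_category(page))
```

Validating the category at construction time keeps mislabeled boxes from silently entering per-category statistics, which matters when a detector is later trained or evaluated per class.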
Taxonomy
Topics: Handwritten Text Recognition Techniques · Advanced Neural Network Applications · Vehicle License Plate Recognition
Methods: Region Proposal Network · RoIPool · Softmax · RoIAlign · Convolution · Mask R-CNN · Faster R-CNN
