A Systematic Evaluation of Object Detection Networks for Scientific Plots
Pritha Ganguly, Nitesh Methani, Mitesh M. Khapra, Pratyush Kumar

TL;DR
This paper evaluates existing object detection networks on scientific plots, shows that their accuracy degrades sharply when high localization precision is required, and proposes modifications that improve accuracy and efficiency for downstream reasoning tasks.
Contribution
The paper introduces a series of modifications including a Laplacian edge-based region proposal method and a custom loss function, significantly enhancing detection accuracy and speed for scientific plot analysis.
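To give a rough idea of how edges can drive region proposals in a plot image, here is a minimal sketch of a discrete Laplacian edge map followed by a bounding box around the detected edges. It is not the paper's procedure: the function names `laplacian_edges` and `bounding_box`, the 4-neighbour kernel, the threshold value, and the synthetic image are all assumptions made for illustration.

```python
import numpy as np

def laplacian_edges(gray, threshold):
    """Return a boolean edge mask from the 4-neighbour discrete Laplacian.

    Hypothetical helper for illustration only.
    """
    # Pad by repeating border pixels so the output has the input's shape.
    g = np.pad(gray.astype(float), 1, mode="edge")
    lap = (
        g[:-2, 1:-1] + g[2:, 1:-1]      # pixels above and below
        + g[1:-1, :-2] + g[1:-1, 2:]    # pixels left and right
        - 4.0 * g[1:-1, 1:-1]           # centre pixel
    )
    return np.abs(lap) > threshold

def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) around all True pixels, or None."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Synthetic "plot": white background with one dark bar
# covering rows 10..29 and columns 15..24.
img = np.full((40, 40), 255.0)
img[10:30, 15:25] = 0.0

# Threshold chosen by hand for this synthetic image.
box = bounding_box(laplacian_edges(img, threshold=100.0))
print(box)
```

The sketch wraps every edge pixel in a single box, which only works because the image contains one object. A real proposal stage would have to separate edges into one candidate region per object, for example by grouping connected edge pixels.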
Findings
Achieved 93.44% mAP at 0.9 IOU with the final model.
Inference is 16 times faster than with existing detectors.
Performance on text objects remains a challenge.
Abstract
Are existing object detection methods adequate for detecting text and visual elements in scientific plots which are arguably different than the objects found in natural images? To answer this question, we train and compare the accuracy of various SOTA object detection networks on the PlotQA dataset. At the standard IOU setting of 0.5, most networks perform well with mAP scores greater than 80% in detecting the relatively simple objects in plots. However, the performance drops drastically when evaluated at a stricter IOU of 0.9 with the best model giving a mAP of 35.70%. Note that such a stricter evaluation is essential when dealing with scientific plots where even minor localisation errors can lead to large errors in downstream numerical inferences. Given this poor performance, we propose minor modifications to existing models by combining ideas from different object detection networks.…
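The gap between the two IOU thresholds can be seen with a small worked example. The sketch below computes intersection over union for axis-aligned boxes given as `(x1, y1, x2, y2)`; the helper name `iou` and the box coordinates are made up for illustration and are not taken from the paper or from PlotQA.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    # Corners of the overlap rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Hypothetical bar in a bar chart: 20 px wide, 200 px tall.
truth = (100, 50, 120, 250)
# Prediction shifted 2 px sideways, with the top edge 10 px too low.
pred = (102, 60, 122, 250)

score = iou(truth, pred)
print(f"IOU = {score:.3f}")
print("counted as correct at 0.5:", score >= 0.5)
print("counted as correct at 0.9:", score >= 0.9)
```

Here the prediction scores an IOU of about 0.78, so it counts as a correct detection at the 0.5 threshold but not at 0.9. The 10 px error in the top edge is 5% of the bar's height, which would carry over directly into any value read off the bar.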
Taxonomy
Topics: Advanced Neural Network Applications · Handwritten Text Recognition Techniques · Advanced Image and Video Retrieval Techniques
Methods: Non Maximum Suppression · Convolution · 1x1 Convolution · Focal Loss · SSD · Feature Pyramid Network · RetinaNet
