TL;DR
This paper argues that evaluating object proposal methods on partially annotated datasets is 'gameable', and introduces a nearly-fully annotated dataset and diagnostic tools to better assess the category independence and generalization of proposal algorithms.
Contribution
It presents a nearly-fully annotated version of the PASCAL VOC dataset, evaluation protocols, and diagnostic tools for assessing object proposal algorithms without relying on partial annotations.
Findings
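The evaluation protocol based on partially annotated datasets is 'gameable': progress under it does not necessarily correspond to a better category-independent proposal algorithm.
The nearly-fully annotated dataset reveals overfitting to a particular list of categories in existing methods.
A diagnostic tool detects the bias capacity of object proposal algorithms.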
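The gameability finding can be illustrated with a toy recall computation. The sketch below is not the paper's experiment: the boxes, the two hypothetical proposal methods, and the 0.5 IoU threshold are all assumptions chosen for illustration. It shows how a method that only fires on annotated categories can match a category-independent method when recall is measured against partial annotations, while falling behind once every object is labeled.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall(proposals, ground_truth, threshold=0.5):
    """Fraction of ground-truth boxes matched by at least one proposal."""
    if not ground_truth:
        return 0.0
    hits = sum(
        any(iou(gt, p) >= threshold for p in proposals) for gt in ground_truth
    )
    return hits / len(ground_truth)


# Hypothetical toy image: two objects from annotated categories and
# two objects from categories the dataset does not label.
annotated = [(10, 10, 50, 50), (60, 60, 100, 100)]
unannotated = [(120, 10, 160, 50), (120, 60, 160, 100)]

# Hypothetical "category-tuned" method: proposes only on annotated categories.
tuned_proposals = [(12, 11, 50, 49), (61, 60, 99, 101)]
# Hypothetical category-independent method: covers every object.
generic_proposals = tuned_proposals + [(121, 10, 160, 51), (119, 61, 160, 100)]

# Recall against partial annotations cannot tell the two methods apart.
partial_tuned = recall(tuned_proposals, annotated)
partial_generic = recall(generic_proposals, annotated)

# Recall against (nearly) full annotations separates them.
full_tuned = recall(tuned_proposals, annotated + unannotated)
full_generic = recall(generic_proposals, annotated + unannotated)

print(partial_tuned, partial_generic, full_tuned, full_generic)
```

In this toy setup both methods reach the same recall on the partial annotations, and only the fuller ground truth exposes the tuned method's missed objects.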
Abstract
Object proposals have quickly become the de-facto pre-processing step in a number of vision pipelines (for object detection, object discovery, and other tasks). Their performance is usually evaluated on partially annotated datasets. In this paper, we argue that the choice of using a partially annotated dataset for evaluation of object proposals is problematic -- as we demonstrate via a thought experiment, the evaluation protocol is 'gameable', in the sense that progress under this protocol does not necessarily correspond to a "better" category independent object proposal algorithm. To alleviate this problem, we: (1) Introduce a nearly-fully annotated version of PASCAL VOC dataset, which serves as a test-bed to check if object proposal techniques are overfitting to a particular list of categories. (2) Perform an exhaustive evaluation of object proposal methods on our introduced…
Videos
Object-Proposal Evaluation Protocol is ‘Gameable’ (YouTube)
