TL;DR
This paper investigates how much contextual information can improve object detection at best. Simple co-occurrence relations can help significantly, but spatial relations and localization errors limit the overall gains.
Contribution
It introduces an optimization-based approach to assess the upper bounds of context utility in object detection and identifies when context provides meaningful improvements.
Findings
Simple co-occurrence relations often yield large detection gains.
Spatial relations and localization errors limit the effectiveness of context.
When false detections stem from localization errors, context cannot significantly improve detection.
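As a rough illustration of the bounding idea (not the paper's actual formulation), the sketch below computes non-interpolated average precision (AP) for a ranked list of detections, then the AP an oracle could reach by only re-scoring the same boxes. The function names, the AP variant, and the toy data are assumptions made for this example. Since re-scoring cannot turn a poorly localized box into a correct one, the bound stays below 1 whenever some objects have no correct box, which is in line with the finding about localization errors.

```python
def average_precision(labels, num_gt):
    """Non-interpolated AP of a ranked detection list.

    labels: booleans ordered by descending detector score,
            True = detection matches a ground-truth object.
    num_gt: number of ground-truth objects, including undetected ones.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_tp in enumerate(labels, start=1):
        if is_tp:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / num_gt if num_gt else 0.0


def oracle_rescoring_bound(labels, num_gt):
    """AP if an oracle re-ranked the same boxes with all true positives first.

    No additional information (context or otherwise) that only changes
    scores can exceed this value, because the set of boxes is fixed.
    """
    return average_precision(sorted(labels, reverse=True), num_gt)


# Hypothetical toy case: 5 detections, 4 ground-truth objects,
# 2 of which have no correctly localized box.
ranked = [True, False, True, False, False]
baseline = average_precision(ranked, num_gt=4)
bound = oracle_rescoring_bound(ranked, num_gt=4)
print(f"baseline AP = {baseline:.3f}, oracle bound = {bound:.3f}")
```

In this toy case the gap between the baseline and the bound is the most any re-scoring scheme could gain. The remaining distance to an AP of 1 comes from objects that have no correct box at all.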
Abstract
The recurring context in which objects appear holds valuable information that can be employed to predict their existence. This intuitive observation indeed led many researchers to endow appearance-based detectors with explicit reasoning about context. The underlying thesis suggests that stronger contextual relations would facilitate greater improvements in detection capacity. In practice, however, the observed improvement in many cases is modest at best, and often only marginal. In this work we seek to improve our understanding of this phenomenon, in part by pursuing an opposite approach. Instead of attempting to improve detection scores by employing context, we treat the utility of context as an optimization problem: to what extent can detection scores be improved by considering context or any other kind of additional information? With this approach we explore the bounds on improvement…
